mir-group / flare_pp

A many-body extension of the FLARE code.
MIT License

LAMMPS simulation not starting when vacuum in structure #36

Open · VictorBrouwers opened this issue 2 years ago

VictorBrouwers commented 2 years ago

I'm modelling perovskite structures, trained with FLARE++ and then mapped to LAMMPS. This generally works wonderfully well, so a big thank you for making this code openly available :).

My process

  1. I train a bulk perovskite structure in FLARE++.
  2. I map the model to LAMMPS.
  3. I run LAMMPS simulations for longer timescales (a minimal input sketch for this step is given after this list).
  4. I run SGP simulations for uncertainty checks and uncertainty visualization.
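
For reference, a minimal sketch of the LAMMPS input I mean for step 3; the data file name, velocity seed, and thermostat settings below are placeholders for illustration, not the exact values from my attached scripts:

# minimal LAMMPS input using a mapped FLARE++ model (placeholder settings)
units        metal
atom_style   atomic
read_data    structure.data          # placeholder data file; masses assumed to be defined in it

pair_style   flare                   # pair style provided by the flare_pp LAMMPS plugin
pair_coeff   * * beta.txt            # mapped coefficient file written by FLARE++

velocity     all create 600.0 12345  # placeholder temperature and seed
timestep     0.001
fix          nvt_run all nvt temp 600.0 600.0 0.1
thermo       1
run          1000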

My problem

The process outlined above usually works fine. However, recently I've started to simulate surfaces with the bulk-trained model (to check transferability). Here, I run into issues.

Screenshots

[Screenshot: surface structure] This is an image of my surface material (called "cub_I_LARGE" in the attached ZIP folder). Again, this does not work in LAMMPS, while it does work with just the SGP model in Python.

[Screenshot: bulk structure] This is an image of my bulk material (in a different, less symmetric phase). Simulating this works perfectly fine.

Desired outcome

I'd like to use a mapped model to run large-scale LAMMPS simulations with surfaces. If I understand correctly, you were also able to do this in the FLARE++ paper for hydrogen-platinum catalysis.

Additional context

  1. LAMMPS_bulk: the working LAMMPS simulation
  2. LAMMPS_surface: the non-working LAMMPS simulation with both the bulk-trained and surface-trained systems
  3. SGP_I_surface: the SGP simulations that do work with both the bulk-trained and surface-trained systems

The ZIP file can be downloaded from here: https://drive.google.com/file/d/1e5g_ysL5Uu_VKz62Rd4_lO5E4EVZypeY/view?usp=sharing

I hope this somewhat clearly illustrates the problem I'm facing.

Kind regards, Victor Brouwers

anjohan commented 2 years ago

Hi Victor,

I've tried running your Surface_issue/LAMMPS_surface/Surface_Trained_NVT/lammps.in, and the atoms are moving for me, both with and without Kokkos[*]. This is with the master branch of flare_pp.[**] My commands and output are:

$HOME/lammps/build/lmp -in lammps.in 
LAMMPS (29 Sep 2021)
OMP_NUM_THREADS environment is not set. Defaulting to 1 thread. (src/comm.cpp:98)
  using 1 OpenMP thread(s) per MPI task
...
Step Time Temp PotEng KinEng TotEng Press Volume 
       0            0          600   -1328.6804    34.667571   -1294.0129    1824.9892    39847.344 
       1        0.001    599.90273   -1328.6751    34.661951   -1294.0132     1823.651    39847.344 
       2        0.002    599.70034   -1328.6644    34.650257   -1294.0141    1823.0908    39847.344 
       3        0.003    599.39306   -1328.6482    34.632503   -1294.0157    1823.3072    39847.344 
       4        0.004    598.98119   -1328.6266    34.608705   -1294.0179    1824.2977    39847.344 
       5        0.005    598.46509   -1328.5996    34.578885   -1294.0207    1826.0589    39847.344 
       6        0.006    597.84518   -1328.5672    34.543067   -1294.0241    1828.5866    39847.344 
       7        0.007    597.12195   -1328.5294     34.50128   -1294.0281    1831.8755    39847.344 
       8        0.008    596.29598   -1328.4863    34.453555   -1294.0327    1835.9193    39847.344 
       9        0.009    595.36788   -1328.4378    34.399931   -1294.0379    1840.7108    39847.344 
      10         0.01    594.33837   -1328.3841    34.340446   -1294.0436    1846.2419    39847.344 
...

$HOME/lammps/build/lmp -sf kk -k on t 12 -pk kokkos newton on neigh full -in lammps.in 
LAMMPS (29 Sep 2021)
KOKKOS mode is enabled (src/KOKKOS/kokkos.cpp:105)
  will use up to 0 GPU(s) per node
  using 12 OpenMP thread(s) per MPI task
...
Step Time Temp PotEng KinEng TotEng Press Volume 
       0            0          600   -1328.6804    34.667571   -1294.0129    1824.9892    39847.344 
       1        0.001    599.90273   -1328.6751    34.661951   -1294.0132     1823.651    39847.344 
       2        0.002    599.70034   -1328.6644    34.650257   -1294.0141    1823.0908    39847.344 
       3        0.003    599.39306   -1328.6482    34.632503   -1294.0157    1823.3072    39847.344 
       4        0.004    598.98119   -1328.6266    34.608705   -1294.0179    1824.2977    39847.344 
       5        0.005    598.46509   -1328.5996    34.578885   -1294.0207    1826.0589    39847.344 
       6        0.006    597.84518   -1328.5672    34.543067   -1294.0241    1828.5866    39847.344 
       7        0.007    597.12195   -1328.5294     34.50128   -1294.0281    1831.8755    39847.344 
       8        0.008    596.29598   -1328.4863    34.453555   -1294.0327    1835.9193    39847.344 
       9        0.009    595.36788   -1328.4378    34.399931   -1294.0379    1840.7108    39847.344 
      10         0.01    594.33837   -1328.3841    34.340446   -1294.0436    1846.2419    39847.344 

Do your results differ? Are the forces on your atoms zero?

[*] Note that unless you are running with GPUs/CUDA, the non-Kokkos version will most likely be faster, since the Kokkos version has been optimized for GPUs in terms of memory layouts etc.

[**] There's been a format change in the coefficient file, so you need to add the power (hopefully 2) on the second line of beta.txt:

DATE: Thu Feb 10 15:53:24 2022 CONTRIBUTOR: YX
2
chebyshev
...

Best wishes, Anders

Attachments: log.lammps.txt, traj.lammps.txt

VictorBrouwers commented 2 years ago

Hi Anders,

Thanks for your quick and clear reply! Running both the Kokkos and non-Kokkos versions on just one core works for me. Extending to multiple cores works just fine with the non-Kokkos version, but not with the Kokkos version (and for some reason this is not a problem with my other systems).

I was not aware that the non-Kokkos version is (probably) faster when not running on GPUs, so thank you for notifying me! The non-Kokkos version also runs just fine on multiple cores, so you can consider my issue resolved!

Kind regards, Victor

anjohan commented 2 years ago

Hm, this is strange. I'll have a look later today. It could be that if you use a lot of MPI ranks, some of them end up without atoms. This should, of course, not be an issue, but it's not something I have ever tested. You could try the balance command (which will generally improve performance if there is vacuum in your system).
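
In case it helps, a rough sketch of what that could look like in the input script; the thresholds and the rebalancing interval below are example values, not something I have tested on your system:

# one-time rebalancing of atoms across MPI ranks before the run,
# so ranks whose subdomains are mostly vacuum don't sit nearly empty
balance      1.1 shift xyz 20 1.05

# alternatively, rebalance periodically during the run (here every 1000 steps)
fix          lb all balance 1000 1.1 shift xyz 20 1.05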

Otherwise, let me know if other issues show up!