mir-group / flare

An open-source Python package for creating fast and accurate interatomic potentials.
https://mir-group.github.io/flare
MIT License
292 stars 71 forks source link

Looking for solutions for "ERROR: MGP lower bound: The interatomic distance is smaller than the 3b lower bound. (../pair_mgp.cpp:382)"? #284

Closed cganley2 closed 3 years ago

cganley2 commented 3 years ago

Describe the bug Not a bug, per se, but I am requesting a solution to the error mentioned in the title.

To Reproduce Steps to reproduce the behavior: I have an admittedly unique system from which I have trained an MGP from AIMD data and would now like to simulate in LAMMPS. Using the input script:

units metal
atom_style atomic
newton off

read_data PNDIClTVT.data

mass 1 12.011
mass 2 35.453
mass 3 18.998
mass 4 1.008
mass 5 14.007
mass 6 22.990
mass 7 15.999
mass 8 32.067
pair_style mgp
pair_coeff * * aimd_PNDIClTVT.mgp C Cl F H N Na O S yes yes

velocity all create 300 3829570 dist gaussian

dump 1 all xyz 1000 dump.xyz
dump_modify 1 element C Cl F H N Na O S
dump_modify 1 sort id

thermo 100
thermo_style   custom time density temp etotal ke pe vol xlo xhi ylo yhi zlo zhi press

timestep 0.001
fix ensemble all nve/limit 0.1
run 10000

the program returns:

LAMMPS (16 Mar 2018)
Reading data file ...
  orthogonal box = (0 0 0) to (16.4 12.202 7.85)
  1 by 1 by 1 MPI processor grid
  reading atoms ...
  92 atoms
MGP coeff: type 1 is element C
MGP coeff: type 2 is element Cl
MGP coeff: type 3 is element F
MGP coeff: type 4 is element H
MGP coeff: type 5 is element N
MGP coeff: type 6 is element Na
MGP coeff: type 7 is element O
MGP coeff: type 8 is element S
MGP read_file: reading potential parameters...
MGP read_file: the file contains 36 2b potential and 512 3b potential
MGP read_file: reading the 0 3b       potential
... _(ellipsis inserted for brevity)_
MGP read_file: reading the 511 3b    potential
Neighbor list info ...
  update every 100 steps, delay 0 steps, check no
  max neighbors/atom: 2000, page size: 100000
  master list distance cutoff = 11
  ghost atom cutoff = 11
  binsize = 5.5, bins = 3 3 2
  1 neighbor lists, perpetual/occasional/extra = 1 0 0
  (1) pair mgp, perpetual
      attributes: full, newton off
      pair build: full/multi
      stencil: full/multi/3d
      bin: standard
Setting up Verlet run ...
  Unit style    : metal
  Current step  : 0
  Time step     : 0.001
ERROR: MGP lower bound: The interatomic distance is smaller than the 3b lower bound. (../pair_mgp.cpp:382)
Last command: run 100

Expected behavior Ideally, LAMMPS would utilize the MGP to calculate the forces within the NVT ensemble to allow things to equilibrate, at which time I would run another ensemble for a much longer time.

Additional context The starting configuration of the system I am studying was taken from the last frame of the AIMD run, so I do not believe the error arises as a result of a strange configuration, or one that the GP would not have seen yet. However, would the GP need to be trained on more data to not encounter this error? Also, in the output file, there are no lines indicating that the 2b potential is read from the *.mgp file. Is this a problem? In short, what is the physical meaning, in terms of the model, when the error mentioned in the title occurs, and how do I address it? Thanks.

YuuuXie commented 3 years ago

Hi @cganley2

Because MGP uses spline interpolation to approximate GP prediction, the interpolation is done in an interval [a, b], where the upper bound b is set to the GP cutoff, and the lower bound a is set to be train_dist - mgp_model.grid_params['lower_bound_relax'], where train_dist is the smallest interatomic distance in the training set.

Outside the interval, the spline function evaluation could be largely deviated from GP, and the accuracy of the interpolation is not guaranteed. Therefore, we set up the ERROR in lammps if there are two atoms get closer to each other than the interpolation lower bound.

To solve this, you can try increasing mgp_model.grid_params['lower_bound_relax'] and re-build the MGP. The default of lower_bound_relax is 0.1, see https://flare.readthedocs.io/en/latest/flare/mgp/mgp.html

A related comment: https://github.com/mir-group/flare/issues/267#issuecomment-742959173

cganley2 commented 3 years ago

This very helpful, thanks so much.

cganley2 commented 3 years ago

I apologize for reopening this, but I have a few more questions.

  1. Is there a way to see which interatomic distance in the dynamics is the source of the error?
  2. Do you have recommendations about how to either augment the training set such that it includes atomic distances that are smaller than what it has seen before?
  3. In a more general sense, what is the purpose of having the lower bound of the spline interpolation of the GP? Why not just set the lower bound to 0, for example?
YuuuXie commented 3 years ago

Hi @cganley2

My apology for this late response. For your questions:

  1. You can pick up the frame that raises the error in your MD, and compute the distances between atoms. You should be able to read the frame via ASE I suppose. And you can check the lower bound in the pair style coefficient file, for example for 2-body:

    C Si 1.355769910136993 4.0 32

    means in the potential the C-Si bond is treated with lower bound 1.356 and upper bound 4.0 with 32 interpolation grids in the interval. And for 3-body

    C C C 1.355769910136993 1.355769910136993 1.355769910136993 4.0 4.0 4.0 16 16 16

    it is "element1 element2 element3 lower_bound1 lower_bound2 lower_bound3 upper_bound1 upper_bound2 upper_bound3 grid1 grid2 grid3"

  2. If you want to augment the training set, one easy way is to increase your training temperature, and the exploration of phase space will be more sufficient.

  3. Yes, the lower bound can be definitely set to 0, for example if you set mgp_model.grid_params['lower_bound_relax'] to be a large value, such that min_distance - lower_bound_relax < 0, then the lower_bound = max(min_distance - lower_bound_relax, 0) is truncated at 0.

    However, notice that here we are doing spline interpolation, which uses finite grid points. The finer the grid is, the more accurate the interpolation will be. But with more grids, the computational cost is more expensive. Therefore, to reduce the number of grids while keeping the accuracy, we want to keep the interval for interpolation as small as possible, such that a small number of grid is also a fine mesh. It is absolutely Ok to use 0 as the lower bound, and you have to see if you need to increase the grid number to reach the same accuracy as before.