Closed cganley2 closed 3 years ago
Hi @cganley2
Because MGP uses spline interpolation to approximate the GP prediction, the interpolation is done on an interval [a, b], where the upper bound b is set to the GP cutoff and the lower bound a is set to train_dist - mgp_model.grid_params['lower_bound_relax'], where train_dist is the smallest interatomic distance in the training set.
Outside this interval, the spline evaluation can deviate substantially from the GP, and the accuracy of the interpolation is not guaranteed. Therefore, LAMMPS raises the ERROR whenever two atoms get closer to each other than the interpolation lower bound.
To solve this, you can try increasing mgp_model.grid_params['lower_bound_relax'] and rebuilding the MGP. The default of lower_bound_relax is 0.1; see
https://flare.readthedocs.io/en/latest/flare/mgp/mgp.html
A related comment: https://github.com/mir-group/flare/issues/267#issuecomment-742959173
This is very helpful, thanks so much.
I apologize for reopening this, but I have a few more questions.
Hi @cganley2
My apologies for the late response. For your questions:
You can pick out the frame that raises the error in your MD run and compute the distances between atoms. You should be able to read the frame with ASE. You can then check the lower bound in the pair style coefficient file, for example for 2-body:
C Si 1.355769910136993 4.0 32
means that the potential treats the C-Si bond with lower bound 1.356 and upper bound 4.0, with 32 interpolation grid points in the interval. And for 3-body
C C C 1.355769910136993 1.355769910136993 1.355769910136993 4.0 4.0 4.0 16 16 16
it is "element1 element2 element3 lower_bound1 lower_bound2 lower_bound3 upper_bound1 upper_bound2 upper_bound3 grid1 grid2 grid3"
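The check described above can be sketched in plain Python. This is only an illustration under stated assumptions: `parse_coeff_line` and `min_pair_distance` are hypothetical helpers (not part of FLARE or LAMMPS), the coefficient lines are assumed to follow exactly the two layouts quoted above, the box is assumed orthorhombic and periodic, and the atom positions are made up. With ASE you would instead read the frame and take the distances from the `Atoms` object.

```python
import math
from itertools import combinations


def parse_coeff_line(line):
    """Parse one 2-body or 3-body coefficient line (hypothetical helper)."""
    tokens = line.split()
    # 2-body: 2 elements + lower + upper + grid            -> 5 tokens
    # 3-body: 3 elements + 3 lowers + 3 uppers + 3 grids   -> 12 tokens
    n_body = {5: 2, 12: 3}.get(len(tokens))
    if n_body is None:
        raise ValueError(f"unrecognised coefficient line: {line!r}")
    n_bounds = 1 if n_body == 2 else 3
    numbers = tokens[n_body:]
    return {
        "elements": tuple(tokens[:n_body]),
        "lower": [float(x) for x in numbers[:n_bounds]],
        "upper": [float(x) for x in numbers[n_bounds:2 * n_bounds]],
        "grid": [int(x) for x in numbers[2 * n_bounds:]],
    }


def min_pair_distance(positions, box):
    """Smallest interatomic distance and the pair that gives it.

    Assumes an orthorhombic periodic box and uses the minimum-image
    convention along each axis.
    """
    best_sq, best_pair = math.inf, None
    for (i, a), (j, b) in combinations(enumerate(positions), 2):
        dist_sq = 0.0
        for k in range(3):
            delta = a[k] - b[k]
            delta -= box[k] * round(delta / box[k])  # wrap to nearest image
            dist_sq += delta * delta
        if dist_sq < best_sq:
            best_sq, best_pair = dist_sq, (i, j)
    return math.sqrt(best_sq), best_pair


two_body = parse_coeff_line("C Si 1.355769910136993 4.0 32")
three_body = parse_coeff_line(
    "C C C 1.355769910136993 1.355769910136993 1.355769910136993 "
    "4.0 4.0 4.0 16 16 16"
)

# Made-up frame: three atoms in a 10 x 10 x 10 box.
frame = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0), (9.5, 0.0, 0.0)]
distance, pair = min_pair_distance(frame, (10.0, 10.0, 10.0))

if distance < two_body["lower"][0]:
    print(f"atoms {pair} are {distance:.3f} apart, "
          f"below the lower bound {two_body['lower'][0]:.3f}")
```

For a real trajectory you would also want to compare each pair against the lower bound of its own element combination rather than a single 2-body entry.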
If you want to augment the training set, one easy way is to increase your training temperature, so that the training run explores more of phase space, including shorter interatomic distances.
Yes, the lower bound can definitely be set to 0. For example, if you set mgp_model.grid_params['lower_bound_relax'] to a large value such that min_distance - lower_bound_relax < 0, then lower_bound = max(min_distance - lower_bound_relax, 0) is truncated at 0.
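As a small illustration of that truncation, here is a minimal sketch. `interpolation_interval` is a hypothetical helper written for this example, not a FLARE function, and the training minimum distance of 1.4558 is an assumed value chosen only to make the numbers concrete.

```python
def interpolation_interval(min_distance, lower_bound_relax, cutoff):
    """Return (lower, upper) of the spline interval as described above."""
    lower = max(min_distance - lower_bound_relax, 0.0)  # never below zero
    return lower, cutoff


# Assumed smallest training distance; 0.1 is the default relax value.
print(interpolation_interval(1.4558, 0.1, 4.0))

# A relax value larger than the smallest distance clamps the bound to 0.
print(interpolation_interval(1.4558, 2.0, 4.0))
```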
However, note that we are doing spline interpolation, which uses a finite number of grid points. The finer the grid, the more accurate the interpolation, but more grid points also mean a higher computational cost. To reduce the number of grid points while keeping the accuracy, we want the interpolation interval to be as small as possible, so that even a small number of grid points gives a fine mesh. It is absolutely OK to use 0 as the lower bound, but you will have to check whether you need to increase the number of grid points to reach the same accuracy as before.
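To get a rough feel for how many more grid points a lower bound of 0 might need, here is a back-of-the-envelope sketch. It assumes the grid points are evenly spaced over the interval, which is an assumption of this example rather than something stated above, and both helper functions are hypothetical. Matching the spacing is only a starting estimate; the accuracy still has to be checked against the GP.

```python
import math


def grid_spacing(lower, upper, n_grid):
    """Spacing between neighbouring points of an evenly spaced grid."""
    return (upper - lower) / (n_grid - 1)


def grid_points_for_spacing(lower, upper, spacing):
    """Fewest evenly spaced points whose spacing is at most `spacing`."""
    return math.ceil((upper - lower) / spacing) + 1


# Values from the 2-body example line: lower 1.3558, upper 4.0, 32 points.
old_spacing = grid_spacing(1.355769910136993, 4.0, 32)

# Same upper bound and same target spacing, but lower bound moved to 0.
new_n_grid = grid_points_for_spacing(0.0, 4.0, old_spacing)

print(f"old spacing: {old_spacing:.4f}")
print(f"grid points needed with lower bound 0: {new_n_grid}")
```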
Describe the bug: Not a bug, per se, but I am requesting a solution to the error mentioned in the title.
To Reproduce: I have an admittedly unique system for which I have trained an MGP from AIMD data and would now like to simulate in LAMMPS. Using the input script:
the program returns:
Expected behavior: Ideally, LAMMPS would use the MGP to calculate the forces within the NVT ensemble to allow things to equilibrate, at which time I would run another ensemble for a much longer time.
Additional context: The starting configuration of the system I am studying was taken from the last frame of the AIMD run, so I do not believe the error arises as a result of a strange configuration, or one that the GP would not have seen yet. However, would the GP need to be trained on more data to not encounter this error? Also, in the output file, there are no lines indicating that the 2b potential is read from the *.mgp file. Is this a problem? In short, what is the physical meaning, in terms of the model, when the error mentioned in the title occurs, and how do I address it? Thanks.