sakibmatin closed this 2 months ago
Since this change will break existing models I think we need to be careful about merging it when the feature is closer to ready - so that we can do so in a way that hopefully does not break models that have already been trained under a smaller regularization parameter.
It makes sense not to merge this now.
The user should be able to set the regularization at training time. One possible implementation is to have the regularization as part of the module_kwargs. I can update the code so that the regularization value is stored in the module's buffer and can be loaded from the state_dict.
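A minimal sketch of that idea, assuming a torch.nn.Module subclass; the class name and constructor are illustrative, not the actual hippynn API:

```python
import torch

class CuspRegLayer(torch.nn.Module):
    """Illustrative module: keep the cusp regularization in a buffer so it
    is serialized with the state_dict but is not a trainable parameter."""

    def __init__(self, cusp_reg=1e-6):
        super().__init__()
        # register_buffer makes the value part of the module state.
        self.register_buffer("cusp_reg", torch.tensor(cusp_reg))

layer = CuspRegLayer(cusp_reg=1e-6)
print("cusp_reg" in layer.state_dict())  # True
```

Because the value lives in the state_dict, it travels with the checkpoint and survives save/load round trips.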
One issue is that existing models do not have the regularization value stored in the buffer. When reloading an existing model, if the regularization is not available, the fallback value will be 1e-30. We can raise a DeprecationWarning informing the user.
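The fallback could look roughly like this; the helper name and dict-based state handling are assumptions for illustration, not the actual implementation:

```python
import warnings

FALLBACK_CUSP_REG = 1e-30  # value older models were effectively trained with

def fill_missing_cusp_reg(state_dict):
    """Illustrative helper: older checkpoints have no stored cusp_reg,
    so fall back to the legacy value and warn the user."""
    if "cusp_reg" not in state_dict:
        warnings.warn(
            "No cusp regularization found in the saved model; "
            f"falling back to the legacy value {FALLBACK_CUSP_REG}.",
            DeprecationWarning,
        )
        state_dict = dict(state_dict)  # do not mutate the caller's dict
        state_dict["cusp_reg"] = FALLBACK_CUSP_REG
    return state_dict
```

New checkpoints that already contain cusp_reg pass through unchanged, so the warning only fires for legacy models.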
Let me know what you think. Thanks.
Sounds like a good plan!
Some comments on the changes.
User-facing: when training a model, there is a new optional kwarg for network_params, namely cusp_reg, which corresponds to the cusp regularization term for HipnnVec and HipnnQuad.
If the user is loading an older model (without the optional cusp_reg), they will get a warning. The old model will still work because of the backward_compatibility function, which mutates the model graph in place to add the missing cusp_reg attribute.
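The in-place fix-up can be pictured roughly like this (a plain-Python sketch; the real backward_compatibility function operates on hippynn's graph objects, and the function name and node handling here are assumptions):

```python
LEGACY_CUSP_REG = 1e-30  # what older models were effectively trained with

def add_missing_cusp_reg(nodes, default=LEGACY_CUSP_REG):
    """Illustrative sketch: walk the model graph and attach a cusp_reg
    attribute to any node that predates the option, mutating in place."""
    for node in nodes:
        if not hasattr(node, "cusp_reg"):
            node.cusp_reg = default
    return nodes
```

Nodes that already carry a cusp_reg are left untouched, so re-running the fix-up on an already-patched model is harmless.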
This has been tested with the Predictor, ASE and lammps calculators.
@sakibmatin I changed how the backwards compatibility is implemented so that it should reflect any type of unpickling process, and so that the cusp regularization parameter is stored as part of the module state dictionary. It works on ASE on my end, do you have any concerns or want to run any further tests before merging?
Looks good to me. I've re-run my tests with your commits and everything passes. I think this is ready to be merged in.
The regularization term for HipnnVec and HipnnQuad is increased to 1e-6. The original regularization was too small, leading to unexpected force predictions. In a separate Pull Request, we can update the interface so that this regularization value can be set by the user at training time.
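To see why the magnitude matters, consider the common pattern of regularizing a square root as sqrt(x + eps); this is only an illustration of the role such a term can play, not the exact hippynn expression. The gradient near x = 0 scales as 1/(2*sqrt(eps)), so a tiny eps produces enormous force contributions:

```python
import math

def grad_sqrt(x, eps):
    # d/dx sqrt(x + eps) = 1 / (2 * sqrt(x + eps))
    return 1.0 / (2.0 * math.sqrt(x + eps))

print(grad_sqrt(0.0, 1e-30))  # ~5e14: gradient blows up
print(grad_sqrt(0.0, 1e-6))   # ~5e2: stays bounded
```

With eps = 1e-30 the gradient at the origin is about twelve orders of magnitude larger than with eps = 1e-6, which is consistent with the unexpected force predictions described above.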