Closed vvassilevg closed 4 years ago
The positions MUST be in Angstroms, otherwise the force (in eV/A) is not the precise derivative of the energy. Make sure the forces are really forces and not gradients (as given by some quantum chemistry packages).
On your soap command:
(n_max,l_max) = (14,14) is complete overkill, and also slows everything down. set it to (12,6) to get high accuracy results, (8,4) for something quicker
you need to add the "add_species=T" command, otherwise all atomic species are ignored
This is further down, once you get the accuracy that you want, but if you want a potential that can be used for high temperature MD, you will need to do something to enforce atomic repulsion at close approach. Either an explicitly fitted 2b potential (can be added to the gap string, look at published papers), or some other baseline. In this case I recommend not using the "e0_method=average" command, but actually explicitly compute the isolated atom energies and add them to the training set (they will be picked up and used for e0 for each species)
Let me know how you get on!
-- Gábor
Gábor Csányi Professor of Molecular Modelling Engineering Laboratory, University of Cambridge Pembroke College Cambridge
Pembroke College supports CARA. A Lifeline to Academics at Risk. http://www.cara.ngo/
On 26 Jun 2020, at 13:00, vvassilevg notifications@github.com wrote:
I would like to fit with GAP the PES of molecules using SOAP. As a test, I am using a glycine molecule (500 training points). So far, I get high Mean Absolute Errors on the training set (the errors are of course even higher for the test set) for both Energies and Forces (above 0.07 eV and 0.7 ev/A, respectively).
I have tested different parameters for the gap fit command. One example is:
gap_fit at_file=train.xyz \ gap={soap cutoff=5.0 \ covariance_type=dot_product \ zeta=2 \ delta=0.016 \ atom_sigma=0.3 \ l_max=14 \ n_max=14 \ n_sparse=2000 \ sparse_method=cur_points} \ force_parameter_name=forces \ e0_method=average \ default_sigma={0.001 0.2 0.0 0.0} \ do_copy_at_file=F sparse_separate_file=F \ gp_file=gap_soap.xml
I have tried different values for l_max, n_max and cutoff.
An example of a molecule in my train.xyz files is:
10 Lattice="200.0 0.0 0.0 0.0 200.0 0.0 0.0 0.0 200.0" Properties=species:S:1:pos:R:3:forces:R:3 energy=-7735.046780 pbc="T T T" N -5.400440 5.468773 2.837348 0.197830 0.001017 0.374624 H -6.365580 4.176578 3.913868 0.136175 0.068701 -0.293331 H -3.910505 6.045725 3.925944 -0.120680 0.003343 -0.049358 C -4.498210 4.417443 0.471706 -0.433094 1.062071 -0.682114 C -2.707543 2.205386 0.405318 0.102343 -0.550705 -0.104142 O -1.920056 1.231402 -1.521152 -0.009198 0.153232 0.169461 O -2.067417 1.333725 2.758112 -0.292697 0.513378 -0.084358 H -0.893916 -0.033471 2.454792 0.283477 -0.392345 -0.057309 H -6.148205 4.006446 -0.739463 0.074871 -0.442791 0.154181 H -3.631109 5.954848 -0.713001 0.060973 -0.415900 0.572343
Energy is in eV, forces in eV/A, and in this case, positions are in a.u., but I have also trained using A.
Is there anything missing (or wrong) when setting the gap_fit command and the soap descriptor?
I will be very grateful for any help you can provide.
Best regards,
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe.
Dear Prof. Csányi,
Thank you for your message.
I already set my gap fit input with your suggested values of (n_max, l_max) and with "add_species=T", and I also set all positions in A. However, the results I am obtaining are practically the same (only forces improve a little).
I think that my forces are not gradients. So, do you think that there is something else missing in my input?
Best regards.
Show me your new command line, and the scatter plot of target vs predicted energies, and target vs predicted force components, the latter coloured according to the element
-- Gábor
Gábor Csányi Professor of Molecular Modelling Engineering Laboratory, University of Cambridge Pembroke College Cambridge
Pembroke College supports CARA. A Lifeline to Academics at Risk. http://www.cara.ngo/
On 29 Jun 2020, at 12:16, vvassilevg notifications@github.com wrote:
Dear Prof. Csányi,
Thank you for your message.
I already set my gap fit input with your suggested values of (n_max, l_max) and with "add_species=T", and I also set all positions in A. However, the results I am obtaining are practically the same (only forces improve a little).
I think that my forces are not gradients. So, do you think that there is something else missing in my input?
Best regards.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.
The command is:
gap_fit at_file=train.xyz \
gap={soap cutoff=5.0 \
covariance_type=dot_product \
zeta=2 \
delta=0.016 \
atom_sigma=0.3 \
add_species=T \
n_max=8 \
l_max=4 \
n_sparse=4000 \
sparse_method=cur_points} \
force_parameter_name=forces \
e0_method=average \
default_sigma={0.001 0.2 0.0 0.0} \
do_copy_at_file=F sparse_separate_file=F \
gp_file=gap_soap.xml
The plots are at the end of the message.
Do you think that (n_max, l_max) = (12,6) could make a huge improvement, or Is it possible that I would need to add the 2b or 3b potentials?
Why are you using such a small delta? it is supposed to be typical energy per atom (really: target function value for the descriptor, but here that is energy per atom), which I would expect to be about 0.1 eV, and you have a 1/10th of that. I wouldn’t expect the n_max,l_max to be your limiting factor.
-- Gábor
Gábor Csányi Professor of Molecular Modelling Engineering Laboratory, University of Cambridge Pembroke College Cambridge
Pembroke College supports CARA. A Lifeline to Academics at Risk. http://www.cara.ngo/
On 29 Jun 2020, at 17:09, vvassilevg notifications@github.com wrote:
The command is:
gap_fit at_file=train.xyz \ gap={soap cutoff=5.0 \ covariance_type=dot_product \ zeta=2 \ delta=0.016 \ atom_sigma=0.3 \ add_species=T \ n_max=8 \ l_max=4 \ n_sparse=4000 \ sparse_method=cur_points} \ force_parameter_name=forces \ e0_method=average \ default_sigma={0.001 0.2 0.0 0.0} \ do_copy_at_file=F sparse_separate_file=F \ gp_file=gap_soap.xml
The plots are at the end of the message.
Do you think that (n_max, l_max) = (12,6) could make a huge improvement, or Is it possible that I would need to add the 2b or 3b potentials?
— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.
I actually think the delta should not matter since there's only one GAP, except for the relative ratio of regularization to delta which might be smudging the predictions (otherwise the alphas just get scaled). I would guess based on experience that the ratio atom_sigma to rcut might be too small for only 8 radial basis functions to resolve accurately. I would increase nmax or even better, increase atom_sigma. And increase the delta as Gábor suggested.
On Mon, 29 Jun 2020, 19:55 gabor1, notifications@github.com wrote:
Why are you using such a small delta? it is supposed to be typical energy per atom (really: target function value for the descriptor, but here that is energy per atom), which I would expect to be about 0.1 eV, and you have a 1/10th of that. I wouldn’t expect the n_max,l_max to be your limiting factor.
-- Gábor
Gábor Csányi Professor of Molecular Modelling Engineering Laboratory, University of Cambridge Pembroke College Cambridge
Pembroke College supports CARA. A Lifeline to Academics at Risk. http://www.cara.ngo/
On 29 Jun 2020, at 17:09, vvassilevg notifications@github.com wrote:
The command is:
gap_fit at_file=train.xyz \ gap={soap cutoff=5.0 \ covariance_type=dot_product \ zeta=2 \ delta=0.016 \ atom_sigma=0.3 \ add_species=T \ n_max=8 \ l_max=4 \ n_sparse=4000 \ sparse_method=cur_points} \ force_parameter_name=forces \ e0_method=average \ default_sigma={0.001 0.2 0.0 0.0} \ do_copy_at_file=F sparse_separate_file=F \ gp_file=gap_soap.xml
The plots are at the end of the message.
Do you think that (n_max, l_max) = (12,6) could make a huge improvement, or Is it possible that I would need to add the 2b or 3b potentials?
— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/libAtoms/QUIP/issues/213#issuecomment-651241613, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADR3Q2ECT5QEDEIKJPWOQRDRZDBRXANCNFSM4OJIKUSQ .
I think the delta matters because there are both energies and forces... it's best to try to stick to the heuristic. I agree that there is a rescaling of both delta and sigma that probably leaves things invariant...
n=8,l=4 is for a crude accuracy, but not this crude...
And the fact that the forces are not around the x=y line is the really troublesome thing
Yes, you are right about the deltas. I still think the atom_sigma is too small for that cutoff. Compare 0.5/3.7 for the a-C GAP to 0.3/5 for this one. Roughly twice as small and with the same number of radial functions. I think that could make for a noisy kernel.
On Mon, 29 Jun 2020, 20:37 gabor1, notifications@github.com wrote:
And the fact that the forces are not around the x=y line is the really troublesome thing
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/libAtoms/QUIP/issues/213#issuecomment-651261913, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADR3Q2AS54NX7QTGUX4STZDRZDGLZANCNFSM4OJIKUSQ .
In general you are right Miguel, but the problems here seem to me bigger.
Thank you both for your comments.
I have tested different values of delta and the best result I have got so far is with 0.25: Energy MAE -- 0.005587 eV Force MAE -- 0.132692 eV/A
Then I kept delta=0.25 and increased atoms_sigma to 0.5, the errors are practically the same, but slightly worse: Energy MAE -- 0.006723 eV Force MAE -- 0.143231 eV/A
Do you think I can improve the accuracy of forces with even higher values of delta and/or using (n_max,l_max) = (12,6)?
Best regards,
Your force errors look much much better. you should think carefully whether these are good enough for your purposes.. you are likely hitting the locality limit. no solutions are easy to make the descriptors longer range (you can use multiple soaps, but in any case you might need a lot more data etc). maybe time to compute some observable that is more directly related to what you want and see if it is good enough?
-- Gábor
Gábor Csányi Professor of Molecular Modelling Engineering Laboratory, University of Cambridge Pembroke College Cambridge
Pembroke College supports CARA. A Lifeline to Academics at Risk. http://www.cara.ngo/
On 30 Jun 2020, at 15:28, vvassilevg notifications@github.com wrote:
Thank you both for your comments.
I have tested different values of delta and the best result I have got so far is with 0.25: Energy MAE -- 0.005587 eV Force MAE -- 0.132692 eV/A
Then I kept delta=0.25 and increased atoms_sigma to 0.5, the errors are practically the same, but slightly worse: Energy MAE -- 0.006723 eV Force MAE -- 0.143231 eV/A
Do you think I can improve the accuracy of forces with even higher values of delta and/or using (n_max,l_max) = (12,6)?
Best regards,
— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.
Indeed, it looks a lot better. It seems that I was mistaken about both deltas and the atom_sigma...
On Tue, 30 Jun 2020 at 17:31, gabor1 notifications@github.com wrote:
Your force errors look much much better. you should think carefully whether these are good enough for your purposes.. you are likely hitting the locality limit. no solutions are easy to make the descriptors longer range (you can use multiple soaps, but in any case you might need a lot more data etc). maybe time to compute some observable that is more directly related to what you want and see if it is good enough?
-- Gábor
Gábor Csányi Professor of Molecular Modelling Engineering Laboratory, University of Cambridge Pembroke College Cambridge
Pembroke College supports CARA. A Lifeline to Academics at Risk. http://www.cara.ngo/
On 30 Jun 2020, at 15:28, vvassilevg notifications@github.com wrote:
Thank you both for your comments.
I have tested different values of delta and the best result I have got so far is with 0.25: Energy MAE -- 0.005587 eV Force MAE -- 0.132692 eV/A
Then I kept delta=0.25 and increased atoms_sigma to 0.5, the errors are practically the same, but slightly worse: Energy MAE -- 0.006723 eV Force MAE -- 0.143231 eV/A
Do you think I can improve the accuracy of forces with even higher values of delta and/or using (n_max,l_max) = (12,6)?
Best regards,
— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/libAtoms/QUIP/issues/213#issuecomment-651830991, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADR3Q2CCFAYNZIIHHQWZ5ELRZHZLJANCNFSM4OJIKUSQ .
-- Dr. Miguel Caro
Academy of Finland Postdoctoral Researcher Department of Electrical Engineering and Automation and Department of Applied Physics Aalto University http://www.aalto.fi, Finland Email: mcaroba@gmail.com Work: miguel.caro@aalto.fi Website: miguelcaro.org
I guess now that I know how to tune the parameters I can test the models to see how they perform and also work with other systems.
Thank you for your help with this issue.
Best regards,
I would like to fit with GAP the PES of molecules using SOAP. As a test, I am using a glycine molecule (500 training points). So far, I get high Mean Absolute Errors on the training set (the errors are of course even higher for the test set) for both Energies and Forces (above 0.07 eV and 0.7 ev/A, respectively).
I have tested different parameters for the gap fit command. One example is:
I have tried different values for l_max, n_max and cutoff.
An example of a molecule in my train.xyz files is:
10 Lattice="200.0 0.0 0.0 0.0 200.0 0.0 0.0 0.0 200.0" Properties=species:S:1:pos:R:3:forces:R:3 energy=-7735.046780 pbc="T T T" N -5.400440 5.468773 2.837348 0.197830 0.001017 0.374624 H -6.365580 4.176578 3.913868 0.136175 0.068701 -0.293331 H -3.910505 6.045725 3.925944 -0.120680 0.003343 -0.049358 C -4.498210 4.417443 0.471706 -0.433094 1.062071 -0.682114 C -2.707543 2.205386 0.405318 0.102343 -0.550705 -0.104142 O -1.920056 1.231402 -1.521152 -0.009198 0.153232 0.169461 O -2.067417 1.333725 2.758112 -0.292697 0.513378 -0.084358 H -0.893916 -0.033471 2.454792 0.283477 -0.392345 -0.057309 H -6.148205 4.006446 -0.739463 0.074871 -0.442791 0.154181 H -3.631109 5.954848 -0.713001 0.060973 -0.415900 0.572343
Energy is in eV, forces in eV/A, and in this case, positions are in a.u., but I have also trained using A.
Is there anything missing (or wrong) when setting the gap_fit command and the soap descriptor?
I will be very grateful for any help you can provide.
Best regards,