libAtoms / QUIP

libAtoms/QUIP molecular dynamics framework: https://libatoms.github.io
351 stars 122 forks source link

The sign of stress and virial used in gap_fit training #227

Closed hsulab closed 4 years ago

hsulab commented 4 years ago

Dear libAtoms users and developers

Recently, I am working on the GAP training of the Pt bulk, where VASP serves as the DFT engine and ASE converts data.

As I follow the virial=stress*volume in doc, the training results is really terrible. However, when the negative values are used virial=-stress*volume, the error is very small. Thus, I am really confused with the sign of stress and virial used by ASE and gap_fit.

The training command used is gap_fit_omp at_file=train.xyz gp_file=GAP.xml do_copy_at_file=F sparse_separate_file=F energy_parameter_name=free_energy force_parameter_name=forces virial_parameter_name=virial default_sigma={0.004 0.04 0.04 0.0} gap={distance_2b cutoff=4.0 delta=0.8 theta_uniform=1.0 covariance_type=ard_se sparse_method=uniform n_sparse=100:angle_3b cutoff=4.0 delta=0.4 theta_uniform=1.0 covariance_type=ard_se sparse_method=uniform n_sparse=250:soap cutoff=4.0 delta=0.2 zeta=2 atom_sigma=0.7 l_max=6 n_max=6 covariance_type=dot_product sparse_method=cur_points n_sparse=250}

And the typical structures in train.xyz are

1
Lattice="10.0 0.0 0.0 0.0 10.0 0.0 0.0 0.0 10.0" Properties=species:S:1:pos:R:3:forces:R:3 energy=0.0 free_energy=0.0 pbc="T T T"
Pt       0.00000000       0.00000000       0.00000000       0.00000000       0.00000000       0.00000000
4
Lattice="3.1113 0.0 0.0 0.0 3.1113 0.0 0.0 0.0 3.1113" Properties=species:S:1:pos:R:3:forces:R:3 energy=34.0622 free_energy=34.0592 stress="-7.322 -0.0 -0.0 -0.0 -7.322 0.0 -0.0 0.0 -7.322" virial="-220.5185 0.0 0.0 0.0 -220.5185 -0.0 0.0 -0.0 -220.5185" pbc="T T T"
Pt       0.00000000       0.00000000       0.00000000      -0.00000000       0.00000100       0.00000200
Pt       0.00000000       1.55563000       1.55563000       0.00000100       0.00000000      -0.00000100
Pt       1.55563000       0.00000000       1.55563000       0.00000100      -0.00000100       0.00000100
Pt       1.55563000       1.55563000       0.00000000      -0.00000100      -0.00000000      -0.00000100
...

To be more illustrative, errors are show in figures, iTerm2 HS85rt error-jx iTerm2 4EhYKo error-jx

gabor1 commented 4 years ago

You nailed it, the conventions of stress are varying... I will amend the docs to draw attention to this. VASP uses a convention opposite to us, and the ASE reader just converts the units, but not the sign.

You didn't ask for it, but I have some comments on your hyperparameters. It is likely that your 3b delta is too large: remember it is per descriptor so that term applies to every triangle within your cutoff. There are a lot of them! More typical delta values are 0.002 and 0.001. Your SOAP can be more accurate and faster if you reduce the l_max to 3 and increase the n_max to 8, that would represent a good medium accuracy, for high accuracy use l_max=6, n_max=12. But this also interacts with your cutoff and atom_sigma. Your atom_sigma is quite large, I would drop it to 0.5. The cutoff is really minimal, it's fine for a crude model, but if you ever need accuracy, you should increase it to 5 or 5.5. Your default energy sigma being 0.004 is ok again for medium accuracy, and your actual training RMSE is 0.005, so basically the same , but your test RMSE is twice as large. I think that is because of your cutoff. If you want a better potential, increase the cutoff, and you can probably lower your energy sigma a bit more. your force sigma is can stay as it is if you increase the cutoff. with the present cutoff you are overfitting it a bit, your test force RMSE is 10x your training force RMSE, so if you don't want to increase the cutoff, but happy with a medium accuracy model, I would raise the force sigma to 0.1. your soap n_sparse is quite small - it would be the next thing to try to double to see if that is what's holding back the accuracy.

how is the virial test and train error (plot it per atom)

noambernstein commented 4 years ago

Coincidentally, I’m pretty sure I just ran into the same issue. The fundamental problem is that there are different sign conventions for virials and stresses. I’m pretty sure that the docs are just wrong, and ASE’s stresses have the opposite signs as the corresponding virials quip’s fitting expects.

Noam

On Fri, Aug 7, 2020 at 6:07 PM Jiayan Xu notifications@github.com wrote:

Dear libAtoms users and developers

Recently, I am working on the GAP training of the Pt bulk, where VASP serves as the DFT engine and ASE converts data.

As I follow the virial=stressvolume in doc https://libatoms.github.io/GAP/gap_fit.html#data, the training results is really terrible. However, when the negative values are used virial=-stressvolume, the error is very small. Thus, I am really confused with the sign of stress and virial used by ASE and gap_fit.

The training command used is gap_fit_omp at_file=train.xyz gp_file=GAP.xml do_copy_at_file=F sparse_separate_file=F energy_parameter_name=free_energy force_parameter_name=forces virial_parameter_name=virial default_sigma={0.004 0.04 0.04 0.0} gap={distance_2b cutoff=4.0 delta=0.8 theta_uniform=1.0 covariance_type=ard_se sparse_method=uniform n_sparse=100:angle_3b cutoff=4.0 delta=0.4 theta_uniform=1.0 covariance_type=ard_se sparse_method=uniform n_sparse=250:soap cutoff=4.0 delta=0.2 zeta=2 atom_sigma=0.7 l_max=6 n_max=6 covariance_type=dot_product sparse_method=cur_points n_sparse=250}

And the typical structures in train.xyz are

1 Lattice="10.0 0.0 0.0 0.0 10.0 0.0 0.0 0.0 10.0" Properties=species:S:1:pos:R:3:forces:R:3 energy=0.0 free_energy=0.0 pbc="T T T" Pt 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 4 Lattice="3.1113 0.0 0.0 0.0 3.1113 0.0 0.0 0.0 3.1113" Properties=species:S:1:pos:R:3:forces:R:3 energy=34.0622 free_energy=34.0592 stress="-7.322 -0.0 -0.0 -0.0 -7.322 0.0 -0.0 0.0 -7.322" virial="-220.5185 0.0 0.0 0.0 -220.5185 -0.0 0.0 -0.0 -220.5185" pbc="T T T" Pt 0.00000000 0.00000000 0.00000000 -0.00000000 0.00000100 0.00000200 Pt 0.00000000 1.55563000 1.55563000 0.00000100 0.00000000 -0.00000100 Pt 1.55563000 0.00000000 1.55563000 0.00000100 -0.00000100 0.00000100 Pt 1.55563000 1.55563000 0.00000000 -0.00000100 -0.00000000 -0.00000100 ...

To be more illustrative, errors are show in figures, [image: iTerm2 HS85rt error-jx] https://user-images.githubusercontent.com/22236986/89692021-653ed500-d902-11ea-8918-d3d9f4530255.png [image: iTerm2 4EhYKo error-jx] https://user-images.githubusercontent.com/22236986/89692045-7daeef80-d902-11ea-8459-ba054588bb3c.png

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/libAtoms/QUIP/issues/227, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADBMJRWRSCIZBDSUSIIYTMLR7R3K7ANCNFSM4PYBB6QQ .

gabor1 commented 4 years ago

What about other codes? Castep? If there is evidence that Quip is the one driving up the wrong way, we can change it...

-- Gábor

On 7 Aug 2020, at 23:34, noambernstein notifications@github.com wrote:

 Coincidentally, I’m pretty sure I just ran into the same issue. The fundamental problem is that there are different sign conventions for virials and stresses. I’m pretty sure that the docs are just wrong, and ASE’s stresses have the opposite signs as the corresponding virials quip’s fitting expects.

Noam

On Fri, Aug 7, 2020 at 6:07 PM Jiayan Xu notifications@github.com wrote:

Dear libAtoms users and developers

Recently, I am working on the GAP training of the Pt bulk, where VASP serves as the DFT engine and ASE converts data.

As I follow the virial=stressvolume in doc https://libatoms.github.io/GAP/gap_fit.html#data, the training results is really terrible. However, when the negative values are used virial=-stressvolume, the error is very small. Thus, I am really confused with the sign of stress and virial used by ASE and gap_fit.

The training command used is gap_fit_omp at_file=train.xyz gp_file=GAP.xml do_copy_at_file=F sparse_separate_file=F energy_parameter_name=free_energy force_parameter_name=forces virial_parameter_name=virial default_sigma={0.004 0.04 0.04 0.0} gap={distance_2b cutoff=4.0 delta=0.8 theta_uniform=1.0 covariance_type=ard_se sparse_method=uniform n_sparse=100:angle_3b cutoff=4.0 delta=0.4 theta_uniform=1.0 covariance_type=ard_se sparse_method=uniform n_sparse=250:soap cutoff=4.0 delta=0.2 zeta=2 atom_sigma=0.7 l_max=6 n_max=6 covariance_type=dot_product sparse_method=cur_points n_sparse=250}

And the typical structures in train.xyz are

1 Lattice="10.0 0.0 0.0 0.0 10.0 0.0 0.0 0.0 10.0" Properties=species:S:1:pos:R:3:forces:R:3 energy=0.0 free_energy=0.0 pbc="T T T" Pt 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 4 Lattice="3.1113 0.0 0.0 0.0 3.1113 0.0 0.0 0.0 3.1113" Properties=species:S:1:pos:R:3:forces:R:3 energy=34.0622 free_energy=34.0592 stress="-7.322 -0.0 -0.0 -0.0 -7.322 0.0 -0.0 0.0 -7.322" virial="-220.5185 0.0 0.0 0.0 -220.5185 -0.0 0.0 -0.0 -220.5185" pbc="T T T" Pt 0.00000000 0.00000000 0.00000000 -0.00000000 0.00000100 0.00000200 Pt 0.00000000 1.55563000 1.55563000 0.00000100 0.00000000 -0.00000100 Pt 1.55563000 0.00000000 1.55563000 0.00000100 -0.00000100 0.00000100 Pt 1.55563000 1.55563000 0.00000000 -0.00000100 -0.00000000 -0.00000100 ...

To be more illustrative, errors are show in figures, [image: iTerm2 HS85rt error-jx] https://user-images.githubusercontent.com/22236986/89692021-653ed500-d902-11ea-8918-d3d9f4530255.png [image: iTerm2 4EhYKo error-jx] https://user-images.githubusercontent.com/22236986/89692045-7daeef80-d902-11ea-8459-ba054588bb3c.png

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/libAtoms/QUIP/issues/227, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADBMJRWRSCIZBDSUSIIYTMLR7R3K7ANCNFSM4PYBB6QQ .

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.

hsulab commented 4 years ago

Many thanks! This Pt model is actually a practice helps me understand the workflow of GAP while the ongoing project is much complicated. The additional comments are really helpful, I will be much careful with the hyper-parameters in the following training.

You nailed it, the conventions of stress are varying... I will amend the docs to draw attention to this. VASP uses a convention opposite to us, and the ASE reader just converts the units, but not the sign.

You didn't ask for it, but I have some comments on your hyperparameters. It is likely that your 3b delta is too large: remember it is per descriptor so that term applies to every triangle within your cutoff. There are a lot of them! More typical delta values are 0.002 and 0.001. Your SOAP can be more accurate and faster if you reduce the l_max to 3 and increase the n_max to 8, that would represent a good medium accuracy, for high accuracy use l_max=6, n_max=12. But this also interacts with your cutoff and atom_sigma. Your atom_sigma is quite large, I would drop it to 0.5. The cutoff is really minimal, it's fine for a crude model, but if you ever need accuracy, you should increase it to 5 or 5.5. Your default energy sigma being 0.004 is ok again for medium accuracy, and your actual training RMSE is 0.005, so basically the same , but your test RMSE is twice as large. I think that is because of your cutoff. If you want a better potential, increase the cutoff, and you can probably lower your energy sigma a bit more. your force sigma is can stay as it is if you increase the cutoff. with the present cutoff you are overfitting it a bit, your test force RMSE is 10x your training force RMSE, so if you don't want to increase the cutoff, but happy with a medium accuracy model, I would raise the force sigma to 0.1. your soap n_sparse is quite small - it would be the next thing to try to double to see if that is what's holding back the accuracy.

how is the virial test and train error (plot it per atom)

gabor1 commented 4 years ago

Checking Wikipedia and the lammps docs, indeed they both say things that are equivalent to

Virial = -stress*volume

-- Gábor

On 7 Aug 2020, at 23:49, Jiayan Xu notifications@github.com wrote:

 Many thanks! This Pt model is actually a practice helps me understand the workflow of GAP while the ongoing project is much complicated. The additional comments are really helpful, I will be much careful with the hyper-parameters in the following training.

You nailed it, the conventions of stress are varying... I will amend the docs to draw attention to this. VASP uses a convention opposite to us, and the ASE reader just converts the units, but not the sign.

You didn't ask for it, but I have some comments on your hyperparameters. It is likely that your 3b delta is too large: remember it is per descriptor so that term applies to every triangle within your cutoff. There are a lot of them! More typical delta values are 0.002 and 0.001. Your SOAP can be more accurate and faster if you reduce the l_max to 3 and increase the n_max to 8, that would represent a good medium accuracy, for high accuracy use l_max=6, n_max=12. But this also interacts with your cutoff and atom_sigma. Your atom_sigma is quite large, I would drop it to 0.5. The cutoff is really minimal, it's fine for a crude model, but if you ever need accuracy, you should increase it to 5 or 5.5. Your default energy sigma being 0.004 is ok again for medium accuracy, and your actual training RMSE is 0.005, so basically the same , but your test RMSE is twice as large. I think that is because of your cutoff. If you want a better potential, increase the cutoff, and you can probably lower your energy sigma a bit more. your force sigma is can stay as it is if you increase the cutoff. with the present cutoff you are overfitting it a bit, your test force RMSE is 10x your training force RMSE, so if you don't want to increase the cutoff, but happy with a medium accuracy model, I would raise the force sigma to 0.1. your soap n_sparse is quite small - it would be the next thing to try to double to see if that is what's holding back the accuracy.

how is the virial test and train error (plot it per atom)

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.

mcaroba commented 4 years ago

I was recently dealing with these conventions for stresses and virials. This is a summary:

VASP uses the convention that compressive stresses are positive and tensile stresses are negative. This is contrary to the usual convention that stress and strain are connected via the (usually positive) stiffness tensor components (aka "elastic constants"). When ASE imports the VASP stress tensor it corrects the sign (to make it compliant with the usual convention) and also corrects the order on the Voigt components if one uses Voigt notation (but the Voigt form should NOT be used with GAP, you should use the full symmetric form with 9 components).

Assuming you are using the usual convention (tensile stress/strain is positive and compressive stress/strain is negative) as provided by the ASE interface, then the virial is connected to the stress tensor via the -volume multiplicative factor: virial = - stress * volume.

jameskermode commented 4 years ago

I agree with Miguel: virial = -stress x volume is the convention, and the ASE <—> quippy Interface respects this. ASE correctly parses VASP and Castep and produces stresses in line with this convention, but for training a GAP it’s up to the user to add a virial parameter to Atoms.info. Extending QUIP to also read stress in the ASE convention would perhaps be a good step forward.

mcaroba commented 4 years ago

I think the least confusing option would be to pass the stress, rather than the virial, especially since extended XYZ files produced by ASE contain this info by default if the original calculation (e.g., VASP's OUTCAR) had a stress tensor attached. The interface could also handle Voigt notation automatically, and internally convert to virial. Then user intervention would not be required, minimizing the potential for errors. I don't know how easy this would be to implement in practice.

jameskermode commented 4 years ago

What do we do if both stress and virial are present in the xyz? Raise an error if they are inconsistent?

mcaroba commented 4 years ago

That sounds like a good sanity check, no?

gabor1 commented 4 years ago

One of the reasons we didn't do this originally is because the consistent stress units (eV/A^3) are not used by anyone anywhere. So then you have to know whether the stress is in GPa, kbar, bar, mmHg or what.

-- Gábor

On 8 Aug 2020, at 09:44, Miguel A. Caro notifications@github.com wrote:

 That sounds like a good sanity check, no?

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.

jameskermode commented 4 years ago

ASE at least does use eV/A^3

gabor1 commented 4 years ago

Ok, so in that case we can just use ase exported stresses without any unit conversion

-- Gábor

On 8 Aug 2020, at 10:06, James Kermode notifications@github.com wrote:

 ASE at least does use eV/A^3

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.

hsulab commented 4 years ago

Thanks for all of your reply!

The stress calculated by the current model is great, rmse less than 0.001 (eV AA-3 per atom). And the gap optimized lattice by ASE-QUIPPY StrainFilter is in a very good agreement with DFT one, 0.01 AA smaller.

I think Miguel's suggestion is useful. Following the ASE convention can reduce some unexpected errors. Maybe a warning can be raised when stress and virial are detected inconsistently. And the update of documentation and tutorial can be much helpful.

bernstei commented 4 years ago

I think it would be nice of QUIP gap fitting was able to use stresses directly, converting to virials internally (assuming the ASE sign convention).

gabor1 commented 4 years ago

This will be added in an upcoming version of GAP.