openforcefield / amber-ff-porting

Scratch space for porting amber FFs into SMIRNOFF format
1 stars 3 forks source link

ASN/GLN energy discrepancies #23

Closed j-wags closed 4 years ago

j-wags commented 4 years ago

A search for all energy discrepancies in MainChain tripeptide parameterization yields the following, having an (AMBER-OpenFF) energy delta >0.01 kJ/mol

ASN_ASP: -0.166534423828125
ASN_GLU: -0.1549072265625
ASN_HIE: -0.05377197265625
ASN_PRO: -0.0245361328125
ASN_THR: -0.065093994140625

Details of each:

ASN_ASP

MainChain/ASN_ASP/ASN_ASP
HarmonicBondForce 6.513474464416504 kJ/mol
HarmonicAngleForce 12.22559928894043 kJ/mol
PeriodicTorsionForce 147.83604431152344 kJ/mol
NonbondedForce -552.8002319335938 kJ/mol
-386.22509765625 kJ/mol
[H][C@@](C(=O)N([H])[C@]([H])(C(=O)N([H])C([H])([H])[H])C([H])([H])C(=O)[O-])(C([H])([H])C(=O)N([H])[H])N([H])C(=O)C([H])([H])[H]
HarmonicAngleForce 12.225651741027832 kJ/mol
HarmonicBondForce 6.513474941253662 kJ/mol
NonbondedForce -552.80126953125 kJ/mol
PeriodicTorsionForce 148.00360107421875 kJ/mol
-386.0585632324219 kJ/mol
155 amber dihedrals (9 impropers) and 155 off dihedrals

ASN_GLU

MainChain/ASN_GLU/ASN_GLU
HarmonicBondForce 5.827162742614746 kJ/mol
HarmonicAngleForce 11.382368087768555 kJ/mol
PeriodicTorsionForce 147.63726806640625 kJ/mol
NonbondedForce -484.2988586425781 kJ/mol
-319.45208740234375 kJ/mol
[H][C@@](C(=O)N([H])[C@]([H])(C(=O)N([H])C([H])([H])[H])C([H])([H])C([H])([H])C(=O)[O-])(C([H])([H])C(=O)N([H])[H])N([H])C(=O)C([$
HarmonicAngleForce 11.38240909576416 kJ/mol
HarmonicBondForce 5.827162265777588 kJ/mol
NonbondedForce -484.2998046875 kJ/mol
PeriodicTorsionForce 147.79307556152344 kJ/mol
-319.29718017578125 kJ/mol
163 amber dihedrals (9 impropers) and 163 off dihedrals

ASN_HIE

MainChain/ASN_HIE/ASN_HIE
HarmonicBondForce 5.926081657409668 kJ/mol
HarmonicAngleForce 88.28634643554688 kJ/mol
PeriodicTorsionForce 131.4774932861328 kJ/mol
NonbondedForce -565.093017578125 kJ/mol
-339.4031677246094 kJ/mol
[H]C1=C(N=C(N1[H])[H])C([H])([H])[C@@]([H])(C(=O)N([H])C([H])([H])[H])N([H])C(=O)[C@]([H])(C([H])([H])C(=O)N([H])[H])N([H])C(=O)C$
HarmonicAngleForce 88.28657531738281 kJ/mol
HarmonicBondForce 5.926081657409668 kJ/mol
NonbondedForce -565.0938720703125 kJ/mol
PeriodicTorsionForce 131.5317840576172 kJ/mol
-339.3493957519531 kJ/mol
171 amber dihedrals (12 impropers) and 171 off dihedrals

ASN_PRO

MainChain/ASN_PRO/ASN_PRO
HarmonicBondForce 4.980006217956543 kJ/mol
HarmonicAngleForce 25.523672103881836 kJ/mol
PeriodicTorsionForce 151.31263732910156 kJ/mol
NonbondedForce -487.31195068359375 kJ/mol
-305.49566650390625 kJ/mol
[H][C@]1(C(C(C(N1C(=O)[C@]([H])(C([H])([H])C(=O)N([H])[H])N([H])C(=O)C([H])([H])[H])([H])[H])([H])[H])([H])[H])C(=O)N([H])C([H])($
HarmonicAngleForce 25.523752212524414 kJ/mol
HarmonicBondForce 4.980005741119385 kJ/mol
NonbondedForce -487.3125915527344 kJ/mol
PeriodicTorsionForce 151.33773803710938 kJ/mol
-305.47113037109375 kJ/mol
164 amber dihedrals (8 impropers) and 164 off dihedrals

ASN_THR

MainChain/ASN_THR/ASN_THR
HarmonicBondForce 5.859104156494141 kJ/mol
HarmonicAngleForce 11.221858024597168 kJ/mol
PeriodicTorsionForce 141.55039978027344 kJ/mol
NonbondedForce -604.9189453125 kJ/mol
-446.28759765625 kJ/mol
[H][C@@](C(=O)N([H])C([H])([H])[H])([C@@]([H])(C([H])([H])[H])O[H])N([H])C(=O)[C@]([H])(C([H])([H])C(=O)N([H])[H])N([H])C(=O)C([H$
HarmonicAngleForce 11.221903800964355 kJ/mol
HarmonicBondForce 5.859104633331299 kJ/mol
NonbondedForce -604.9188232421875 kJ/mol
PeriodicTorsionForce 141.61532592773438 kJ/mol
-446.2225036621094 kJ/mol
170 amber dihedrals (8 impropers) and 170 off dihedrals
j-wags commented 4 years ago

Relevant comment from other issue:

On the issue of ASN dihedral energy discrepancies, flipping the locations of the two side chain amino hydrogens in the .mol2 file (the structure which feeds into the OFF energy evaluation) has no effect on the significant (0.0480 kJ/mol) deviation between OFF and Amber results. Flipping the locations in both structures likewise has no effect.

j-wags commented 4 years ago

Doing a more detailed analysis using @dscerutti's SortResults.py highlights additional structures where individual energy components differ.

(off-dev) jeffreywagner@JW-MBP$ grep -A 1 -B 3  "+++" 2020_09_01_SortResult_output1 
ASN_ASP
  Bond:     6.5135    6.5135   -0.0000       
  Angl:    12.2256   12.2257   -0.0001    
  Dihe:   147.8360  148.0036   -0.1676    +++
  Nonb:  -552.8002 -552.8013    0.0010    
--
--
ASN_GLU
  Bond:     5.8272    5.8272    0.0000       
  Angl:    11.3824   11.3824   -0.0000    
  Dihe:   147.6373  147.7931   -0.1558    +++
  Nonb:  -484.2989 -484.2998    0.0009    
--
--
ASN_GLY
  Bond:     4.5501    4.5501    0.0000       
  Angl:     8.6737    8.6738   -0.0000    
  Dihe:    98.5796   98.5868   -0.0072    +++
  Nonb:  -505.8692 -505.8701    0.0008    
--
--
ASN_HIE
  Bond:     5.9261    5.9261    0.0000       
  Angl:    88.2863   88.2866   -0.0002    
  Dihe:   131.4775  131.5318   -0.0543    +++
  Nonb:  -565.0930 -565.0939    0.0009    
--
--
ASN_LYS
  Bond:     6.0608    6.0608    0.0000       
  Angl:    12.6164   12.6164    0.0000    
  Dihe:   123.3779  123.3844   -0.0065    +++
  Nonb:  -345.9822 -345.9836    0.0013    
--
--
ASN_PRO
  Bond:     4.9800    4.9800    0.0000       
  Angl:    25.5237   25.5238   -0.0001    
  Dihe:   151.3126  151.3377   -0.0251    +++
  Nonb:  -487.3120 -487.3126    0.0006    
--
--
ASN_THR
  Bond:     5.8591    5.8591   -0.0000       
  Angl:    11.2219   11.2219   -0.0000    
  Dihe:   141.5504  141.6153   -0.0649    +++
  Nonb:  -604.9189 -604.9188   -0.0001    
--
--
GLN_PRO
  Bond:     5.1209    5.1209    0.0000       
  Angl:    26.0500   26.0501   -0.0001    
  Dihe:   148.3480  148.3540   -0.0060    +++
  Nonb:  -410.7910 -410.7915    0.0006    
--
--
PRO_ASN
  Bond:     4.2244    4.2244    0.0000       
  Angl:    24.9894   24.9894   -0.0000    
  Dihe:   144.4091  144.4143   -0.0052    +++
  Nonb:  -461.7927 -461.7937    0.0010    
j-wags commented 4 years ago

Switching the order of the tagged hydrogens in the sidechain N improper for ASN/GLN seems to resolve this issue.

Before:

<Improper smirks="[H][C@@]([C]=O)(C([H])([H])C([H])([H])[C:1](=O)[N:3]([H:2])[H:4])[N][H]" periodicity1="2" phase1="180.0 * degree" k1="1.0 * mole**-1 * kilocalorie" id="A14SB-MainChain-GLN-C_H_N_H" idivf1="1.0"></Improper>

After:

<Improper smirks="[H][C@@]([C]=O)(C([H])([H])C([H])([H])[C:1](=O)[N:3]([H:4])[H:2])[N][H]" periodicity1="2" phase1="180.0 * degree" k1="1.0 * mole**-1 * kilocalorie" id="A14SB-MainChain-GLN-C_H_N_H" idivf1="1.0"></Improper>

It's unlikely that we can solve this in the general case, since these hydrogens are indistinguishable in a graph molecule (unless bond stereochemistry is assigned to the C-N bond, which might be a direction to look in the future).

Results of testing with hydrogen-rearranged FF

(off-dev) jeffreywagner@JW-MBP$ python SortResult.py 2020_09_01_output_asn_gln_discrepancy1.clean 
Main Chain
  Type     Amber      OFF      Diff.   Significant?
ASN_ASP
  Bond:     6.5758    6.5758    0.0000       
  Angl:    12.2873   12.2874   -0.0001    
  Dihe:   148.2591  148.2593   -0.0002    
  Nonb:  -553.4116 -553.4128    0.0012    
ASN_GLU
  Bond:     5.7722    5.7722    0.0000       
  Angl:    11.3831   11.3831   -0.0000    
  Dihe:   147.6311  147.6313   -0.0002    
  Nonb:  -484.2434 -484.2444    0.0010    
ASN_GLY
  Bond:     4.5193    4.5193    0.0000       
  Angl:     8.7029    8.7030   -0.0000    
  Dihe:    98.4893   98.4894   -0.0001    
  Nonb:  -505.7188 -505.7196    0.0008    
ASN_HIE
  Bond:     5.8931    5.8931    0.0000       
  Angl:    88.4333   88.4336   -0.0002    
  Dihe:   131.0322  131.0330   -0.0008    
  Nonb:  -564.1450 -564.1458    0.0009    
ASN_LYS
  Bond:     6.6866    6.6866    0.0000       
  Angl:    13.3347   13.3347   -0.0000    
  Dihe:   123.4211  123.4216   -0.0005    
  Nonb:  -346.9043 -346.9056    0.0013    
ASN_PRO
  Bond:     4.9945    4.9945   -0.0000       
  Angl:    25.4940   25.4941   -0.0001    
  Dihe:   151.2304  151.2335   -0.0031    
  Nonb:  -487.2285 -487.2290    0.0005    
ASN_THR
  Bond:     5.8824    5.8824    0.0000       
  Angl:    11.0472   11.0472   -0.0000    
  Dihe:   141.7485  141.7486   -0.0001    
  Nonb:  -605.2682 -605.2683    0.0001    
GLN_PRO
  Bond:     4.9920    4.9920    0.0000       
  Angl:    26.1492   26.1492   -0.0001    
  Dihe:   146.3806  146.3838   -0.0032    
  Nonb:  -407.5635 -407.5640    0.0005    
PRO_ASN
  Bond:     4.1971    4.1971    0.0000       
  Angl:    24.9287   24.9288   -0.0000    
  Dihe:   144.5038  144.5072   -0.0033    
  Nonb:  -461.9353 -461.9362    0.0009    

N-Terminal
  Type     Amber      OFF      Diff.   Significant?

C-Terminal
  Type     Amber      OFF      Diff.   Significant?
j-wags commented 4 years ago

I think this can be closed. We can anticipate energy discrepancies due to H ordering in a random subset of ASN/GLN sidechains on the order of ~0.01 kcal/mol, but this is basically unavoidable without knowing atom names during parameterization.