openforcefield / amber-ff-porting

Scratch space for porting amber FFs into SMIRNOFF format
1 stars 3 forks source link

Fix improper energies #16

Closed j-wags closed 3 years ago

j-wags commented 4 years ago

For reference, the worst case of improper mismatch is NTerminal GLU

Before this PR:

NTerminal/GLU/GLU
HarmonicBondForce 2.1379218101501465 kJ/mol
HarmonicAngleForce 6.882795810699463 kJ/mol
PeriodicTorsionForce 46.8884162902832 kJ/mol
NonbondedForce -415.76202392578125 kJ/mol
-359.8529052734375 kJ/mol
[H][C@@](C(=O)N([H])C([H])([H])[H])(C([H])([H])C([H])([H])C(=O)[O-])[N+]([H])([H])[H]
HarmonicAngleForce 6.882845401763916 kJ/mol
HarmonicBondForce 2.1379215717315674 kJ/mol
NonbondedForce -415.7628173828125 kJ/mol
PeriodicTorsionForce 28.416242599487305 kJ/mol
-378.3258056640625 kJ/mol
67 amber dihedrals (3 impropers) and 73 off dihedrals
j-wags commented 3 years ago

These fixes show pretty systematic improvement. NTerm GLU now:

NTerminal/GLU/GLU
HarmonicBondForce 2.1379218101501465 kJ/mol
HarmonicAngleForce 6.882795810699463 kJ/mol
PeriodicTorsionForce 46.8884162902832 kJ/mol
NonbondedForce -415.76202392578125 kJ/mol
-359.8529052734375 kJ/mol
[H][C@@](C(=O)N([H])C([H])([H])[H])(C([H])([H])C([H])([H])C(=O)[O-])[N+]([H])([H])[H]
HarmonicAngleForce 6.882845401763916 kJ/mol
HarmonicBondForce 2.1379215717315674 kJ/mol
NonbondedForce -415.7628173828125 kJ/mol
PeriodicTorsionForce 48.01765441894531 kJ/mol
-358.7243957519531 kJ/mol
67 amber dihedrals (3 impropers) and 73 off dihedrals
j-wags commented 3 years ago

Preventing reorderings of the matches by substituting ImproperDict with dict in ImproperTorsionHandler has some weird effects. Note that I DON'T specify threefold SMIRKS here -- I only should be assigning one improper per center. I'll need to look into this more.

Screen Shot 2020-08-05 at 6 24 40 PM
j-wags commented 3 years ago

Changed the OFF Toolkit to NOT apply the threefold torsion (which it was doing previously, even if the OFFXML was written with only one ordering of the atoms). Results looking much better now:

NTerminal/GLU/GLU
HarmonicBondForce 2.1379218101501465 kJ/mol
HarmonicAngleForce 6.882795810699463 kJ/mol
PeriodicTorsionForce 46.8884162902832 kJ/mol
NonbondedForce -415.76202392578125 kJ/mol
-359.8529052734375 kJ/mol
[H][C@@](C(=O)N([H])C([H])([H])[H])(C([H])([H])C([H])([H])C(=O)[O-])[N+]([H])([H])[H]
HarmonicAngleForce 6.882845401763916 kJ/mol
HarmonicBondForce 2.1379215717315674 kJ/mol
NonbondedForce -415.7628173828125 kJ/mol
PeriodicTorsionForce 46.88935089111328 kJ/mol
-359.8526916503906 kJ/mol
67 amber dihedrals (3 impropers) and 67 off dihedrals
j-wags commented 3 years ago

90% of improper counts match now, but I'm still seeing a small number of discrepancies like below. All of the mismatches occur when there are five or more impropers. Maybe it has to do with symmetric groups like carboxylates and guanidiniums.

MainChain/ARG/ARG
HarmonicBondForce 8.432280540466309 kJ/mol
HarmonicAngleForce 5.6592864990234375 kJ/mol
PeriodicTorsionForce 43.686195373535156 kJ/mol
NonbondedForce -668.3820190429688 kJ/mol
-610.6043090820312 kJ/mol
[H][C@@](C(=O)N([H])C([H])([H])[H])(C([H])([H])C([H])([H])C([H])([H])[N+](=C(N([H])[H])N([H])[H])[H])N([H])C(=O)C([H])([H])[H]
HarmonicAngleForce 5.659282684326172 kJ/mol
HarmonicBondForce 8.432280540466309 kJ/mol
NonbondedForce -668.378173828125 kJ/mol
PeriodicTorsionForce 43.68626403808594 kJ/mol
-610.6004028320312 kJ/mol
122 amber dihedrals (8 impropers) and 125 off dihedrals
j-wags commented 3 years ago

Ok, I'm now deduplicating matches based on their central atom, and this looks even better. I think this level of discrepancy is acceptable. The big question now is how to package the numerous changes needed to the OFF Toolkit to enable this behavior. I think the shortest path will be to make an external AmberImproperTorsionParameterHandler, with the modified behavior, and store that in this repo. We'll also need to change the tag on the OFFXMLs created here to say AmberImproperTorsions so they get routed to the right toolkit behavior.

MainChain/ARG/ARG
HarmonicBondForce 8.432280540466309 kJ/mol
HarmonicAngleForce 5.6592864990234375 kJ/mol
PeriodicTorsionForce 43.686195373535156 kJ/mol
NonbondedForce -668.3820190429688 kJ/mol
-610.6043090820312 kJ/mol
[H][C@@](C(=O)N([H])C([H])([H])[H])(C([H])([H])C([H])([H])C([H])([H])[N+](=C(N([H])[H])N([H])[H])[H])N([H])C(=O)C([H])([H])[H]
HarmonicAngleForce 5.659282684326172 kJ/mol
HarmonicBondForce 8.432280540466309 kJ/mol
NonbondedForce -668.378173828125 kJ/mol
PeriodicTorsionForce 43.68622589111328 kJ/mol
-610.600341796875 kJ/mol
122 amber dihedrals (8 impropers) and 122 off dihedrals

And this looks good for the "challenging" Nterm GLU case

NTerminal/GLU/GLU
HarmonicBondForce 2.1379218101501465 kJ/mol
HarmonicAngleForce 6.882795810699463 kJ/mol
PeriodicTorsionForce 46.8884162902832 kJ/mol
NonbondedForce -415.76202392578125 kJ/mol
-359.8529052734375 kJ/mol
[H][C@@](C(=O)N([H])C([H])([H])[H])(C([H])([H])C([H])([H])C(=O)[O-])[N+]([H])([H])[H]
HarmonicAngleForce 6.882845401763916 kJ/mol
HarmonicBondForce 2.1379215717315674 kJ/mol
NonbondedForce -415.7628173828125 kJ/mol
PeriodicTorsionForce 46.889320373535156 kJ/mol
-359.85272216796875 kJ/mol
67 amber dihedrals (3 impropers) and 67 off dihedrals
dscerutti commented 3 years ago

This looks really good! The major discrepancy is in the nonbonded interactions, and even that is no longer so big. The impropers, if now consistent and correct throughout, will give a much better reproduction of the true Amber energy surface.

j-wags commented 3 years ago

@dscerutti This should be ready for review + merge if you have time this week. Let me know what you think of the new results if you pull this branch and rerun the comparison. If you're pulling the files one-by-one, make sure to get the new amberimpropertorsionhandler.py file.

I'm off this week, but can make edits and work with you on next steps the week of Aug 24.

dscerutti commented 3 years ago

Still a minor problem in N-terminal LEU, the non-bonded interactions are off by -0.0137 kJ/mol. There is another issue in C-terminal ASN, where the dihedral energies still differ by -0.0584 kJ/mol. But, all of the other interactions match to within 0.01 kJ/mol. Tightening the tolerance, main-chain ASN dihedral energy differs by -0.0078 kJ/mol, and N-terminal PRO dihedrals differ by -0.0059 kJ/mol.

But for the C-terminal ASN, I think that the rest of these differences are probably forgivable in the space of proteins, but for more general applicability we'd better track down what's happening with N-terminal LEU (the lone non-bonded energy violation) and see if perhaps there is something more pressing to uncover in ASN generally. There is a planar group on the side-chain that involves impropers. It might be good to try twisting some structures to exaggerate these and other impropers, because the equilibrium state of all these things is going to be near zero, and saying "zero" does not necessarily mean that we understand the details of the potential energy term as they should be.

I will make a "challenge" data set with more interesting, strained conformations to see if our new scripts can get those. Otherwise, this pull request looks like great progress.

dscerutti commented 3 years ago

I am doing some additional tests on the main chain ASN. In its well optimized state, there is a discrepancy of 0.0078 kcal/mol between the Amber and OpenFF representations (I am taking the sign of that discrepancy to be irrelevant). If I try to twist the peptide nitrogen on the backbone out of planarity, that is make the C-N(-H)-CA atoms pucker, I do not see the energy difference between the two representations grow. Nor does it grow when I pucker the CB-CG(-OD1)-ND2 planar group. However, when I pucker the CG-ND2(-HD12)-HD22 group, the discrepancy grows 70-fold from a trivial 0.0078 kcal/mol to almost 0.5 kcal/mol. That's the issue, something in the CG-ND2(-HD12)-HD22 impropers.

Same exaggeration of the difference, exposure of the problem, happens in GLN when I twist the amide group on that side chain. This is likely the problem with the C-terminal ASN as well.