openforcefield / amber-ff-porting

Scratch space for porting amber FFs into SMIRNOFF format
1 stars 3 forks source link

Remaining improper energy differences #18

Closed j-wags closed 3 years ago

j-wags commented 3 years ago

Still a minor problem in N-terminal LEU, the non-bonded interactions are off by -0.0137 kJ/mol. There is another issue in C-terminal ASN, where the dihedral energies still differ by -0.0584 kJ/mol. But, all of the other interactions match to within 0.01 kJ/mol. Tightening the tolerance, main-chain ASN dihedral energy differs by -0.0078 kJ/mol, and N-terminal PRO dihedrals differ by -0.0059 kJ/mol.

But for the C-terminal ASN, I think that the rest of these differences are probably forgivable in the space of proteins, but for more general applicability we'd better track down what's happening with N-terminal LEU (the lone non-bonded energy violation) and see if perhaps there is something more pressing to uncover in ASN generally. There is a planar group on the side-chain that involves impropers. It might be good to try twisting some structures to exaggerate these and other impropers, because the equilibrium state of all these things is going to be near zero, and saying "zero" does not necessarily mean that we understand the details of the potential energy term as they should be.

I will make a "challenge" data set with more interesting, strained conformations to see if our new scripts can get those. Otherwise, this pull request looks like great progress.

Originally posted by @dscerutti in https://github.com/openforcefield/amber-ff-porting/pull/16#issuecomment-675759352

j-wags commented 3 years ago

Also see: https://github.com/openforcefield/amber-ff-porting/pull/16#issuecomment-676557790

I am doing some additional tests on the main chain ASN. In its well optimized state, there is a discrepancy of 0.0078 kcal/mol between the Amber and OpenFF representations (I am taking the sign of that discrepancy to be irrelevant). If I try to twist the peptide nitrogen on the backbone out of planarity, that is make the C-N(-H)-CA atoms pucker, I do not see the energy difference between the two representations grow. Nor does it grow when I pucker the CB-CG(-OD1)-ND2 planar group. However, when I pucker the CG-ND2(-HD12)-HD22 group, the discrepancy grows 70-fold from a trivial 0.0078 kcal/mol to almost 0.5 kcal/mol. That's the issue, something in the CG-ND2(-HD12)-HD22 impropers.

Same exaggeration of the difference, exposure of the problem, happens in GLN when I twist the amide group on that side chain. This is likely the problem with the C-terminal ASN as well.

j-wags commented 3 years ago

Remaining problems with > 1e-2 kJ/mol discrepancy

ASN sidechain HNs DO have the same parameters (charge/vdW) in prmtop.

Difference might be due to OFF ordering HNs arbitrarily, while AMBER evaluates improper angle with them ordered by name.

Validation experiment for this would be to switch H positions in one of the engines and see if energies match again.

j-wags commented 3 years ago

@dscerutti found that NTerm LEU discrepancy seems to be because CD1 and CD2 have different charges assigned.

dscerutti commented 3 years ago

Did some investigation into the N-terminal LEU. Manually equalizing the charges in the OFF-XML file reduces the discrepancy by half, and manually moving either of the CD1 or CD2 atoms on the LEU side chain causes significant changes in the discrepancy (it gets worse if the atoms are moved a great distance, ignoring deleterious effects that these manipulations have on the valence energies). Moving any of the hydrogen atoms, in contrast, has no effect on the non-bonded energy discrepancy, and if any of the above manipulations are done by incremental amounts (half an Angstrom or less), all of the other energy terms remain in agreement. Manually changing both the OFF-XML and the Amber prmtop to place charges of -0.4105 on both CD1 and CD2, as opposed to -0.4104 and -0.4106, leads to strong agreement in the energy.

It is therefore demonstrated that the non-bonded energy difference we are seeing owes entirely to electrostatics, specifically to the incongruent charges on N-terminal LEU's two side chain methyl carbons. I suggest that we equalize the charge by some ad-hoc fix in ConvertParameters.py and accept the reduced discrepancy relative to Amber with a more correct depiction in the OFF product.

On the issue of ASN dihedral energy discrepancies, flipping the locations of the two side chain amino hydrogens in the .mol2 file (the structure which feeds into the OFF energy evaluation) has no effect on the significant (0.0480 kJ/mol) deviation between OFF and Amber results. Flipping the locations in both structures likewise has no effect.

dscerutti commented 3 years ago

Checked the PRO dihedrals, perturbations of at least one other potential hot spot don't seem to have any deleterious effect on the agreement. The overall agreement is still pretty good (0.001 kcal/mol), so that one gets a pass.

j-wags commented 3 years ago

Each of the individual cases in this Issue have been split out into separate discussions, so this issue can be closed.