Closed j-wags closed 4 years ago
Relevant comment from other issue:
On the issue of ASN dihedral energy discrepancies, flipping the locations of the two side chain amino hydrogens in the .mol2 file (the structure which feeds into the OFF energy evaluation) has no effect on the significant (0.0480 kJ/mol) deviation between OFF and Amber results. Flipping the locations in both structures likewise has no effect.
Doing a more detailed analysis using @dscerutti's SortResults.py
highlights additional structures where individual energy components differ.
(off-dev) jeffreywagner@JW-MBP$ grep -A 1 -B 3 "+++" 2020_09_01_SortResult_output1
ASN_ASP
Bond: 6.5135 6.5135 -0.0000
Angl: 12.2256 12.2257 -0.0001
Dihe: 147.8360 148.0036 -0.1676 +++
Nonb: -552.8002 -552.8013 0.0010
--
--
ASN_GLU
Bond: 5.8272 5.8272 0.0000
Angl: 11.3824 11.3824 -0.0000
Dihe: 147.6373 147.7931 -0.1558 +++
Nonb: -484.2989 -484.2998 0.0009
--
--
ASN_GLY
Bond: 4.5501 4.5501 0.0000
Angl: 8.6737 8.6738 -0.0000
Dihe: 98.5796 98.5868 -0.0072 +++
Nonb: -505.8692 -505.8701 0.0008
--
--
ASN_HIE
Bond: 5.9261 5.9261 0.0000
Angl: 88.2863 88.2866 -0.0002
Dihe: 131.4775 131.5318 -0.0543 +++
Nonb: -565.0930 -565.0939 0.0009
--
--
ASN_LYS
Bond: 6.0608 6.0608 0.0000
Angl: 12.6164 12.6164 0.0000
Dihe: 123.3779 123.3844 -0.0065 +++
Nonb: -345.9822 -345.9836 0.0013
--
--
ASN_PRO
Bond: 4.9800 4.9800 0.0000
Angl: 25.5237 25.5238 -0.0001
Dihe: 151.3126 151.3377 -0.0251 +++
Nonb: -487.3120 -487.3126 0.0006
--
--
ASN_THR
Bond: 5.8591 5.8591 -0.0000
Angl: 11.2219 11.2219 -0.0000
Dihe: 141.5504 141.6153 -0.0649 +++
Nonb: -604.9189 -604.9188 -0.0001
--
--
GLN_PRO
Bond: 5.1209 5.1209 0.0000
Angl: 26.0500 26.0501 -0.0001
Dihe: 148.3480 148.3540 -0.0060 +++
Nonb: -410.7910 -410.7915 0.0006
--
--
PRO_ASN
Bond: 4.2244 4.2244 0.0000
Angl: 24.9894 24.9894 -0.0000
Dihe: 144.4091 144.4143 -0.0052 +++
Nonb: -461.7927 -461.7937 0.0010
Switching the order of the tagged hydrogens in the sidechain N improper for ASN/GLN seems to resolve this issue.
Before:
<Improper smirks="[H][C@@]([C]=O)(C([H])([H])C([H])([H])[C:1](=O)[N:3]([H:2])[H:4])[N][H]" periodicity1="2" phase1="180.0 * degree" k1="1.0 * mole**-1 * kilocalorie" id="A14SB-MainChain-GLN-C_H_N_H" idivf1="1.0"></Improper>
After:
<Improper smirks="[H][C@@]([C]=O)(C([H])([H])C([H])([H])[C:1](=O)[N:3]([H:4])[H:2])[N][H]" periodicity1="2" phase1="180.0 * degree" k1="1.0 * mole**-1 * kilocalorie" id="A14SB-MainChain-GLN-C_H_N_H" idivf1="1.0"></Improper>
It's unlikely that we can solve this in the general case, since these hydrogens are indistinguishable in a graph molecule (unless bond stereochemistry is assigned to the C-N bond, which might be a direction to look in the future).
(off-dev) jeffreywagner@JW-MBP$ python SortResult.py 2020_09_01_output_asn_gln_discrepancy1.clean
Main Chain
Type Amber OFF Diff. Significant?
ASN_ASP
Bond: 6.5758 6.5758 0.0000
Angl: 12.2873 12.2874 -0.0001
Dihe: 148.2591 148.2593 -0.0002
Nonb: -553.4116 -553.4128 0.0012
ASN_GLU
Bond: 5.7722 5.7722 0.0000
Angl: 11.3831 11.3831 -0.0000
Dihe: 147.6311 147.6313 -0.0002
Nonb: -484.2434 -484.2444 0.0010
ASN_GLY
Bond: 4.5193 4.5193 0.0000
Angl: 8.7029 8.7030 -0.0000
Dihe: 98.4893 98.4894 -0.0001
Nonb: -505.7188 -505.7196 0.0008
ASN_HIE
Bond: 5.8931 5.8931 0.0000
Angl: 88.4333 88.4336 -0.0002
Dihe: 131.0322 131.0330 -0.0008
Nonb: -564.1450 -564.1458 0.0009
ASN_LYS
Bond: 6.6866 6.6866 0.0000
Angl: 13.3347 13.3347 -0.0000
Dihe: 123.4211 123.4216 -0.0005
Nonb: -346.9043 -346.9056 0.0013
ASN_PRO
Bond: 4.9945 4.9945 -0.0000
Angl: 25.4940 25.4941 -0.0001
Dihe: 151.2304 151.2335 -0.0031
Nonb: -487.2285 -487.2290 0.0005
ASN_THR
Bond: 5.8824 5.8824 0.0000
Angl: 11.0472 11.0472 -0.0000
Dihe: 141.7485 141.7486 -0.0001
Nonb: -605.2682 -605.2683 0.0001
GLN_PRO
Bond: 4.9920 4.9920 0.0000
Angl: 26.1492 26.1492 -0.0001
Dihe: 146.3806 146.3838 -0.0032
Nonb: -407.5635 -407.5640 0.0005
PRO_ASN
Bond: 4.1971 4.1971 0.0000
Angl: 24.9287 24.9288 -0.0000
Dihe: 144.5038 144.5072 -0.0033
Nonb: -461.9353 -461.9362 0.0009
N-Terminal
Type Amber OFF Diff. Significant?
C-Terminal
Type Amber OFF Diff. Significant?
I think this can be closed. We can anticipate energy discrepancies due to H ordering in a random subset of ASN/GLN sidechains on the order of ~0.01 kcal/mol, but this is basically unavoidable without knowing atom names during parameterization.
A search for all energy discrepancies in MainChain tripeptide parameterization yields the following, having an (AMBER-OpenFF) energy delta >0.01 kJ/mol
Details of each:
ASN_ASP
ASN_GLU
ASN_HIE
ASN_PRO
ASN_THR