Open chapincavender opened 2 years ago
To add to the list, similar issue with Nitro groups, C1=CC=C(C=C1)[N+](=O)[O-]
, t81 and t83 get assigned to the two oxygens.
SMILES [H:10][c:1]1[c:2]([c:3]([c:4]([c:5]([c:6]1[H:14])[H:13])[N+:7](=[O:8])[O-:9])[H:12])[H:11]
3 4 7 8 t83 [*:1]=[#7X2,#7X3+1:2]-[#6X3:3]=,:[*:4]
5 4 7 8 t83 [*:1]=[#7X2,#7X3+1:2]-[#6X3:3]=,:[*:4]
3 4 7 9 t81 [#8X1:1]~[#7X3:2]~[#6X3:3]~[*:4]
5 4 7 9 t81 [#8X1:1]~[#7X3:2]~[#6X3:3]~[*:4]
Whatever's going on, I can verify it's not due to recent changes in the toolkit. The reproduction behaves the same back to 0.9.0 and modifying it to the openforcefield
imports allows going as far back as Sage can be loaded. All versions assign t17
and t18
to the carboxylate like above.
@mattwthompson Yeah, it's not toolkit related, it's from the smarts definition of applied parameters distinguishing single and double bonds.
Copying from slack discussion: introducing new parameters t18a, t83a, t87a, which would explicitly match these cases can be a potential workaround.
<Proper smirks="[*:1]-[#6X4:2]-[#6X3:3](~[#8X1])~[#8X1:4]" periodicity1="2" phase1="0.0 * degree" id="t18a" k1="-0.3703352413219 * mole**-1 * kilocalorie" idivf1="1.0"></Proper>
<Proper smirks="[#8X1:1]~[#7X3:2](~[#8X1])-[#6:3]~[*:4]" periodicity1="2" phase1="180.0 * degree" id="t83a" k1="1.550515418264 * mole**-1 * kilocalorie" idivf1="1.0"></Proper>
<Proper smirks="[*:1]-[#7X3:2]~[#6X3:3](~[#7])~[#7:4]" periodicity1="2" phase1="180.0 * degree" id="t87a" k1="3.337310191161 * mole**-1 * kilocalorie" idivf1="1.0"></Proper>
Following up on some suggestions from Chris Bayly, it looks like phosphate esters and sulfate esters are assigned correctly because there is only one torsion SMIRKS that matches them. The nitrogen torsions in the amidinium cation are assigned incorrectly.
from openff.toolkit.topology import Molecule
from openff.toolkit.typing.engines.smirnoff import ForceField
ff = ForceField('openff-2.0.0.offxml')
offmol = Molecule.from_smiles('CCCC(=[NH2+])N')
labels = ff.label_molecules(offmol.to_topology())[0]
torsion_label = labels['ProperTorsions']
print(f'SMILES {offmol.to_smiles(mapped=True)}')
torsion_smirks = '[#6:1]-[#6:2]-[#6:3]~[#7:4]'
for match in offmol.chemical_environment_matches(torsion_smirks):
if match in torsion_label:
parameter = torsion_label[match]
else:
parameter = torsion_label[match[::-1]]
print(
f'{match[0]+1:2d} {match[1]+1:2d} {match[2]+1:2d} {match[3]+1:2d} '
f'{parameter.id:4s} {parameter.smirks}'
)
produces
SMILES [H:7][C:1]([H:8])([H:9])[C:2]([H:10])([H:11])[C:3]([H:12])([H:13])[C:4](=[N+:5]([H:14])[H:15])[N:6]([H:16])[H:17]
2 3 4 5 t18 [*:1]-[#6X4:2]-[#6X3:3]=[*:4]
2 3 4 6 t23 [#6X4:1]-[#6X4:2]-[#6X3:3]-[#7X3:4]
Describe the bug Chemically equivalent atoms in which a nonzero charge is delocalized (represented by resonance structures with the formal charge on different atoms) are assigned different torsions by OpenFF 2.0.0. Examples include carboxylate and guanidinium groups.
To Reproduce MWE for the amino acids aspartate (carboxylate) and arginine (guanidinium).
Output The code above produces
For aspartate, the C-C-C=O torsion gets t18, but C-C-C-[O-] gets t17. For symmetric arginine, where the formal charge is on the epsilon nitrogen, both C-[N+]=C-N torsions get t87. For asymmetric arginine, where the formal charge is on one of the zeta nitrogens, the C-N-C=[N+] torsion gets t79, but C-N-C-N gets t75.
Computing environment (please complete the following information): OS is Ubuntu 20.04.5.
openff-toolkit
is 0.10.6.openff-forcefields
is 2.0.0. Output ofconda list
is