Open GregorySchwing opened 10 months ago
Thanks for the report, this is probably a bug in Interchange, but it might be a deeper incompatibility with the information Amber expects since it has opinions about bond graphs and physics parameters. I think ParmEd assigns arbitrary an force constant of 50,000 kcal/mol or something in cases like these, which is not ideal but maybe the best among the possible options.
The OFFXML files don't have bond information since they are rigid models.
We probably haven't encountered this yet since most use cases involve loading a general force field, which will include some sort of harmonic bond parameters to fall back on (even though, again, those are not meant to be used with these water models). Here's an example of using the TIP3P model, writing to Amber files:
In [1]: from openff.toolkit import Molecule, ForceField
In [2]: ForceField(
"openff-2.0.0.offxml",
"tip3p-1.0.0.offxml",
).create_interchange(
Molecule.from_mapped_smiles(
"[O:1]([H:2])[H:3]"
).to_topology(),
).to_prmtop('foo')
(In the case of TIP3P, loading it up separately is not necessary since Sage bundles it.)
I understand, but I still have issues when I do this
forcefield = ForceField("openff-2.1.0.offxml", "opc-1.0.0.offxml")
and build prmtop and incrd files
Here is an incrd directly from tleap
sh-5.0# cat opc.inpcrd
OPC
4
0.0000000 0.0000000 0.0000000 0.8724330 0.0000000 0.0000000
-0.2051460 0.8479710 0.0000000 0.0985730 0.1252640 0.0000000
Interchange
3 0.0000000e+00
-0.0040000 0.3560000 -0.0000000 -0.8250000 -0.1790000 -0.0000000
0.8290000 -0.1770000 0.0000000
Perhaps related to #783.
Right, but that's for a different reason. Four-site water models aren't implemented in the Amber writers yet (surprisingly, there hasn't been much interest in it yet) - there'll be much more missing than bond information, as you've found out. Amber's support for virtual sites seems to be much more limited than i.e. OpenMM, but doing four-site water models is tractable.
I'm hoping to tackle it later month - keep an eye out for releases or updates to that issue!
Okay - I found the issue with the details I was thinking of: #732
Update. Going through parmed works, or appears to work
# Imports from the toolkit
from openff.toolkit import ForceField, Molecule, Topology
from openff.units import Quantity, unit
from openff.interchange import Interchange
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import rdmolfiles
def embed(mol, seed=None):
params = AllChem.ETKDGv2()
if seed is not None:
params.randomSeed = seed
else:
params.randomSeed = 123
AllChem.EmbedMolecule(mol, params)
return mol
forcefield = ForceField("openff-2.1.0.offxml", "opc-1.0.0.offxml")
openff_mol = Molecule.from_mapped_smiles("[O:1]([H:2])[H:3]")
"""
rdmol = openff_mol.to_rdkit()
rdmol3D = embed(rdmol,123)
# Create a PDB file
writer = rdmolfiles.MolToPDBFile(rdmol3D, "water.pdb")
# Load the topology from a PDB file and `Molecule` objects
topology = Topology.from_pdb(
"water.pdb",
unique_molecules=[openff_mol],
)
"""
# Generate a conformer to be used as atomic coordinates
openff_mol.generate_conformers(n_conformers=1)
# Convert this molecule to a topology
topology = openff_mol.to_topology()
# Define periodicity via box vectors
topology.box_vectors = unit.Quantity([4, 4, 4], unit.nanometer)
interchange = Interchange.from_smirnoff(
force_field=forcefield,
topology=topology,
)
# FAILS!
#interchange.to_prmtop("water.prmtop")
#interchange.to_inpcrd("water.inpcrd")
interchange.to_gro("water.gro")
interchange.to_top("water.top")
from parmed import gromacs, amber, unit as u
gmx_top = gromacs.GromacsTopologyFile('water.top')
gmx_gro = gromacs.GromacsGroFile.parse('water.gro')
gmx_top.box = gmx_gro.box # Needed because .prmtop contains box info
gmx_top.positions = gmx_gro.positions
amb_prm = amber.AmberParm.from_structure(gmx_top)
amb_prm.write_parm("water.prmtop")
amb_inpcrd = amber.AmberAsciiRestart("water.inpcrd", mode="w")
amb_inpcrd.coordinates = gmx_top.coordinates
amb_inpcrd.box = gmx_top.box
amb_inpcrd.close()
Right - this particular behavior is not implemented in the Amber writer. It's implemented in the OpenMM and GROMACS writers. Verison 0.3.18 makes this distinction more clear.
How would you go about getting the water geometry correct? Current angle after generating confomations is 111 degrees. Should be 103.6 degrees. Bond distances are wrong also.
Are you referring to the conformers you generated here?
openff_mol.generate_conformers(n_conformers=1)
These conformers are generated by RDKit (or a similar tool). They don't use a force field, but should be close enough to the MM minima that it'll equilibrate to whatever's specified by the force field. If for some reason you need them to start at a force field minima, you can hard-code the positions/conformers or use Interchange.minimize
.
interchange.to_gro() fails after Interchange.minimize since the virtual site is added by openmm.
print("BEFORE",interchange.positions)
interchange.minimize()
print("AFTER",interchange.positions)
interchange.to_gro("water.gro")
interchange.to_top("water.top")
BEFORE [[-8.161597176171645e-05 0.036637842847512564 -0.0] [-0.08123162006042906 -0.018348210785835466 -0.0] [0.08131323603219076 -0.018289632061677063 0.0]] nanometer
AFTER [[-7.367174112005159e-05 0.03652547672390938 0.0] [-0.06858216226100922 -0.017492877319455147 0.0] [0.06853829324245453 -0.017361382022500038 0.0] [-5.838654760736972e-05 0.02058565244078636 0.0]] nanometer
Traceback (most recent call last):
File "/tmp/test.py", line 58, in <module>
interchange.to_gro("water.gro")
File "/opt/conda/lib/python3.10/site-packages/openff/interchange/components/interchange.py", line 394, in to_gro
GROMACSWriter(
File "/opt/conda/lib/python3.10/site-packages/openff/interchange/interop/gromacs/export/_export.py", line 48, in to_gro
self._write_gro(gro, decimal)
File "/opt/conda/lib/python3.10/site-packages/openff/interchange/interop/gromacs/export/_export.py", line 323, in _write_gro
assert n_particles == self.system.positions.shape[0], (
AssertionError: (4, 5)
I'm going to leave this last comment:
Going through parmed results in an incorrect prmtop file for my purposes. It may run in amber, but I'm trying to create 1D-RISM models. Therefore, the LJ_types need to be correct.
This is just with SPCE 3-point water, so I could test interchange. SPCE has 2 LJ_types, (O and H).
Here is an excerpt of the pointers of the prmtop file from interchange -> gromacs -> parmed -> amber
>>> print(parm.pointers)
{'NATOM': 3, 'NTYPES': 3, 'NBONH': 3, 'MBONA': 0, 'NTHETH': 0, 'MTHETA': 0, 'NPHIH': 0, 'MPHIA': 0, 'NHPARM': 0, 'NPARM': 0, 'NEXT': 4, 'NNB': 4, 'NRES': 1, 'NBONA': 0, 'NTHETA': 0, 'NPHIA': 0, 'NUMBND': 2, 'NUMANG': 0, 'NPTRA': 0, 'NATYP': 1, 'NPHB': 0, 'IFPERT': 0, 'NBPER': 0, 'NGPER': 0, 'NDPER': 0, 'MBPER': 0, 'MGPER': 0, 'MDPER': 0, 'IFBOX': 1, 'NMXRS': 3, 'IFCAP': 0, 'NUMEXTRA': 0, 'IPTRES': 1, 'NSPM': 1, 'NSPSOL': 2}
>>> parm.LJ_types
{'MOL0': 3}
>>> print(parm.LJ_depth)
[0.15210000020871914, 0, 0]
>>> print(parm.LJ_radius)
[1.7682705875656692, 0, 0]
interchange -> amber
>>> print(parm.pointers)
{'NATOM': 3, 'NTYPES': 2, 'NBONH': 2, 'MBONA': 0, 'NTHETH': 1, 'MTHETA': 0, 'NPHIH': 0, 'MPHIA': 0, 'NHPARM': 0, 'NPARM': 0, 'NEXT': 4, 'NNB': 4, 'NRES': 1, 'NBONA': 0, 'NTHETA': 0, 'NPHIA': 0, 'NUMBND': 1, 'NUMANG': 1, 'NPTRA': 0, 'NATYP': 1, 'NPHB': 0, 'IFPERT': 0, 'NBPER': 0, 'NGPER': 0, 'NDPER': 0, 'MBPER': 0, 'MGPER': 0, 'MDPER': 0, 'IFBOX': 1, 'NMXRS': 3, 'IFCAP': 0, 'NUMEXTRA': 0, 'IPTRES': 1, 'NSPM': 1, 'NSPSOL': 2, 'NCOPY': 0}
>>> parm.LJ_types
{'O1': 1, 'H1': 2, 'H2': 2}
>>> print(parm.LJ_depth)
[0.15210000020871914, 0]
>>> print(parm.LJ_radius)
[1.7682705875656692, 0]
The only way to get a correct prmtop file is for interchange to write the amber file. Parmed messes up the LJ_types.
Thanks for all the hacking - I'm trying to distill everything down to specific things I can fix. Interchange by necessity makes some approximations with atom types (SMIRNOFF does not have atom types whereas most other tools, especially anything Amber-related have them deeply built-in) and one of the consequences is that GROMACS files have more atom types than strictly necessary, i.e.
[ atomtypes ]
;type, bondingtype, atomic_number, mass, charge, ptype, sigma, epsilon
MOL0_0 8 15.99943 0.0000000000000000 A 0.31507 0.6363864000000001
MOL0_1 1 1.007947 0.0000000000000000 A 0.09999999999999999 0
MOL0_2 1 1.007947 0.0000000000000000 A 0.09999999999999999 0
ParmEd is probably parsing what's in the file, which looks funny (identical atom types, with no interactions) by isn't incorrect for running simulations. (Maybe something's out of whack with OPC due to the virtual site, I've only run this with a three-site model.)
Seems the problem is in Parmed's Gromacs to Amber converter upon inspecting your gromacs file. Further, my last post was using a 3-site water, so it's not caused by virtual sites.
I could maybe use a context reset to better understand what exactly is going wrong here (when virtual sites are not present)
Describe the bug I'd like to be able to write amber parameter files using interchange from openff-toolkit. However, no bond information is included in the water xmls leading to failure.
To Reproduce
If the problem involves a specific molecule or file, please upload that as well. -->
Output
Computing environment (please complete the following information): openff-toolkit 0.14.4