openforcefield / openff-interchange

A project (and object) for storing, manipulating, and converting molecular mechanics data.
https://docs.openforcefield.org/projects/interchange
MIT License
71 stars 23 forks source link

Amber needs bond force constants, even for rigid water models #732

Closed mattwthompson closed 1 week ago

mattwthompson commented 1 year ago

Description

Consider starting from GROMACS files of something containing a fixed three-site water model. These files include sufficient information to construct a chemical graph that includes information about which atoms are bonded to which atoms and which atom pairs are constrained and at what distance. This information is stored:

In [1]: from openff.interchange import Interchange

In [2]: %env INTERCHANGE_EXPERIMENTAL=1
env: INTERCHANGE_EXPERIMENTAL=1

In [3]: [
   ...:     *Interchange.from_gromacs(
   ...:         topology_file="water-dimer.top", gro_file="water-dimer.gro"
   ...:     ).topology.bonds
   ...: ]
Out[3]:
[<openff.toolkit.topology._mm_molecule._SimpleBond at 0x1728a42e0>,
 <openff.toolkit.topology._mm_molecule._SimpleBond at 0x1728a43d0>,
 <openff.toolkit.topology._mm_molecule._SimpleBond at 0x1728a47f0>,
 <openff.toolkit.topology._mm_molecule._SimpleBond at 0x1728a48e0>]

In [4]:

In [4]: [
   ...:     *Interchange.from_gromacs(
   ...:         topology_file="water-dimer.top", gro_file="water-dimer.gro"
   ...:     )["Constraints"]
   ...: ]
Out[4]:
[('type', 'Constraints'),
 ('is_plugin', False),
 ('expression', ''),
 ('key_map',
  {BondKey with atom indices (0, 1): PotentialKey associated with handler 'Constraints' with id 'O-H-settles',
   BondKey with atom indices (0, 2): PotentialKey associated with handler 'Constraints' with id 'O-H-settles',
   BondKey with atom indices (1, 2): PotentialKey associated with handler 'Constraints' with id 'H-H-settles',
   BondKey with atom indices (3, 4): PotentialKey associated with handler 'Constraints' with id 'O-H-settles',
   BondKey with atom indices (3, 5): PotentialKey associated with handler 'Constraints' with id 'O-H-settles',
   BondKey with atom indices (4, 5): PotentialKey associated with handler 'Constraints' with id 'H-H-settles'}),
 ('potentials',
  {PotentialKey associated with handler 'Constraints' with id 'O-H-settles': Potential(parameters={'distance': <Quantity(0.0971676331, 'nanometer')>}, map_key=None),
   PotentialKey associated with handler 'Constraints' with id 'H-H-settles': Potential(parameters={'distance': <Quantity(0.151390065, 'nanometer')>}, map_key=None)})]

and, since the force constant of the H-O bond is not specified for rigid water models, there is no associated harmonic bond interaction:

In [5]: [
   ...:     *Interchange.from_gromacs(
   ...:         topology_file="water-dimer.top", gro_file="water-dimer.gro"
   ...:     )["Bonds"]
   ...: ]
Out[5]:
[('type', 'Bonds'),
 ('is_plugin', False),
 ('expression', 'k/2*(r-length)**2'),
 ('key_map', {}),
 ('potentials', {})]

This is not an issue for OpenMM or GROMACS evaluation, since they accurately understand that they don't need a bond force constant:

In [7]: from openff.interchange.drivers import *

In [8]: get_openmm_energies(
   ...:     Interchange.from_gromacs(
   ...:         topology_file="water-dimer.top", gro_file="water-dimer.gro"
   ...:     )
   ...: )
Out[8]: EnergyReport(energies={'Bond': <Quantity(0.0, 'kilojoule / mole')>, 'Angle': <Quantity(0.0, 'kilojoule / mole')>, 'Torsion': <Quantity(0.0, 'kilojoule / mole')>, 'Nonbonded': <Quantity(-13.1543274, 'kilojoule / mole')>})

In [9]: get_gromacs_energies(
   ...:     Interchange.from_gromacs(
   ...:         topology_file="water-dimer.top", gro_file="water-dimer.gro"
   ...:     )
   ...: )
/Users/mattthompson/software/openff-interchange/openff/interchange/components/mdconfig.py:326: UserWarning: Ambiguous failure while processing constraints. Constraining h-bonds as a stopgap.
  warnings.warn(
Out[9]: EnergyReport(energies={'Bond': <Quantity(0.0, 'kilojoule / mole')>, 'Angle': <Quantity(0.0, 'kilojoule / mole')>, 'Torsion': <Quantity(0.0, 'kilojoule / mole')>, 'RBTorsion': <Quantity(0.0, 'kilojoule / mole')>, 'vdW': <Quantity(-0.18334013, 'kilojoule / mole')>, 'Electrostatics': <Quantity(-13.0003176, 'kilojoule / mole')>})

But Amber, which mixes the topology (which atoms are bonded to which) and parameters (what is the force constant of each harmonic bond) chokes:

In [10]: get_amber_energies(
    ...:     Interchange.from_gromacs(
    ...:         topology_file="water-dimer.top", gro_file="water-dimer.gro"
    ...:     )
    ...: )
/Users/mattthompson/software/openff-interchange/openff/interchange/components/mdconfig.py:326: UserWarning: Ambiguous failure while processing constraints. Constraining h-bonds as a stopgap.
  warnings.warn(
---------------------------------------------------------------------------
SanderError                               Traceback (most recent call last)
Cell In[10], line 1
----> 1 get_amber_energies(
      2     Interchange.from_gromacs(
      3         topology_file="water-dimer.top", gro_file="water-dimer.gro"
      4     )
      5 )

File ~/software/openff-interchange/openff/interchange/drivers/amber.py:48, in get_amber_energies(interchange, writer, detailed)
     22 def get_amber_energies(
     23     interchange: Interchange,
     24     writer: str = "internal",
     25     detailed: bool = False,
     26 ) -> EnergyReport:
     27     """
     28     Given an OpenFF Interchange object, return single-point energies as computed by Amber.
     29
   (...)
     45
     46     """
     47     return _process(
---> 48         _get_amber_energies(
     49             interchange=interchange,
     50             writer=writer,
     51         ),
     52         detailed=False,
     53     )

File ~/software/openff-interchange/openff/interchange/drivers/amber.py:75, in _get_amber_energies(interchange, writer)
     72 mdconfig = MDConfig.from_interchange(interchange)
     73 mdconfig.write_sander_input_file("run.in")
---> 75 return _run_sander(
     76     prmtop_file="out.prmtop",
     77     inpcrd_file="out.inpcrd",
     78     input_file="run.in",
     79 )

File ~/software/openff-interchange/openff/interchange/drivers/amber.py:126, in _run_sander(inpcrd_file, prmtop_file, input_file)
    123 _, err = sander.communicate()
    125 if sander.returncode:
--> 126     raise SanderError(err)
    128 return _parse_amber_energy("mdinfo")

SanderError: At line 655 of file /Users/runner/miniforge3/conda-bld/ambertools_1683278257006/work/AmberTools/src/sander/rdparm.F90 (unit = 8, file = 'out.prmtop')
Fortran runtime error: Bad value during integer read

Error termination. Backtrace:
#0  0x18019d2b6
#1  0x18019e1a5
#2  0x18019ee5b
#3  0x180777fc3
#4  0x18077e841
#5  0x18077fd93
#6  0x1008faced
#7  0x1008d3f76
#8  0x1008d1e43
#9  0x1008d1eab

This traceback is not super useful; the issue is that BONDS_INC_HYDROGEN needs to be populated to describe the bond graph but can't be populated with defined parameters.

In [15]: parmed.load_file("water-dimer.prmtop")
---------------------------------------------------------------------------
AmberError                                Traceback (most recent call last)
Cell In[15], line 1
----> 1 parmed.load_file("water-dimer.prmtop")

File ~/mambaforge/envs/openff-interchange-env/lib/python3.9/site-packages/parmed/formats/registry.py:194, in load_file(filename, *args, **kwargs)
    192     _prune_argument(cls.parse, kwargs, 'hasbox')
    193     _prune_argument(cls.parse, kwargs, 'skip_bonds')
--> 194     return cls.parse(filename, *args, **kwargs)
    195 elif hasattr(cls, 'open_old'):
    196     _prune_argument(cls.open_old, kwargs, 'structure')

File ~/mambaforge/envs/openff-interchange-env/lib/python3.9/site-packages/parmed/amber/amberformat.py:353, in AmberFormat.parse(filename, *args, **kwargs)
    351 from ._tinkerparm import BeemanRestart
    352 try:
--> 353     return LoadParm(filename, *args, **kwargs)
    354 except (IndexError, KeyError):
    355     parm = AmberFormat(filename, *args, **kwargs)

File ~/mambaforge/envs/openff-interchange-env/lib/python3.9/site-packages/parmed/amber/readparm.py:48, in LoadParm(parmname, xyz, box)
     46     parm = parm.view_as(AmoebaParm)
     47 else:
---> 48     parm = parm.view_as(AmberParm)
     50 if isinstance(xyz, str):
     51     f = load_file(xyz)

File ~/mambaforge/envs/openff-interchange-env/lib/python3.9/site-packages/parmed/amber/amberformat.py:420, in AmberFormat.view_as(self, cls)
    418 if type(self) is cls:
    419     return self
--> 420 return cls.from_rawdata(self)

File ~/mambaforge/envs/openff-interchange-env/lib/python3.9/site-packages/parmed/amber/_amberparm.py:233, in AmberParm.from_rawdata(cls, rawdata)
    231 inst.parm_comments = rawdata.parm_comments
    232 inst.flag_list = rawdata.flag_list
--> 233 inst.initialize_topology()
    234 # See if the rawdata has any kind of structural attributes, like
    235 # coordinates and an atom list with positions and/or velocities
    236 if hasattr(rawdata, 'coordinates'):

File ~/mambaforge/envs/openff-interchange-env/lib/python3.9/site-packages/parmed/amber/_amberparm.py:185, in AmberParm.initialize_topology(self, xyz, box)
    183         self.combining_rule = 'lorentz'
    184 # Instantiate the Structure data structures
--> 185 self.load_structure()
    187 if isinstance(xyz, str):
    188     f = load_file(xyz, skip_bonds=True)

File ~/mambaforge/envs/openff-interchange-env/lib/python3.9/site-packages/parmed/amber/_amberparm.py:478, in AmberParm.load_structure(self)
    472 def load_structure(self):
    473     """
    474     Loads all of the topology instance variables. This is necessary if we
    475     actually want to modify the topological layout of our system
    476     (like deleting atoms)
    477     """
--> 478     self._check_section_lengths()
    479     self._load_atoms_and_residues()
    480     self.load_atom_info()

File ~/mambaforge/envs/openff-interchange-env/lib/python3.9/site-packages/parmed/amber/_amberparm.py:1293, in AmberParm._check_section_lengths(self)
   1291 check_length('SCNB_SCALE_FACTOR', self.ptr('NPTRA'), False)
   1292 check_length('SOLTY', self.ptr('NATYP'))
-> 1293 check_length('BONDS_INC_HYDROGEN', self.ptr('NBONH')*3)
   1294 check_length('BONDS_WITHOUT_HYDROGEN', self.ptr('MBONA')*3)
   1295 check_length('ANGLES_INC_HYDROGEN', self.ptr('NTHETH')*4)

File ~/mambaforge/envs/openff-interchange-env/lib/python3.9/site-packages/parmed/amber/_amberparm.py:1257, in AmberParm._check_section_lengths.<locals>.check_length(key, length, required)
   1255     return
   1256 if len(self.parm_data[key]) != length:
-> 1257     raise AmberError(f'FLAG {key} has {len(self.parm_data[key])} elements; expected {length}')

AmberError: FLAG BONDS_INC_HYDROGEN has 0 elements; expected 12

ParmEd works around this by assigning a (presumably) arbitrary force constant of 50,000 $\frac{kJ}{{nm}^2}$ to these bonds, including an H-H "bond."

mattwthompson commented 10 months ago

Here's another way of representing the same problem - just water, using a rigid model:

from openff.toolkit import ForceField, Molecule
from openff.units import Quantity, unit

from openff.interchange.components._packmol import UNIT_CUBE, pack_box
from openff.interchange.drivers import get_amber_energies, get_openmm_energies

water = Molecule.from_mapped_smiles("[H:2][O:1][H:3]")
topology = pack_box(
    molecules=[water],
    number_of_copies=[2000],
    mass_density=Quantity(1.0, unit.gram / unit.milliliter),
    box_shape=UNIT_CUBE,
)
out = ForceField("tip3p.offxml").create_interchange(topology)

get_openmm_energies(out)
"""
EnergyReport(energies={'Nonbonded': <Quantity(104273.423, 'kilojoule / mole')>})
"""

get_amber_energies(out)
"""
LookupError: Could not find component Bonds. This object has the following collections registered:
    ['Constraints', 'vdW', 'Electrostatics']
"""

The writer can't write out the topological bonds since there's no physics associated with them. As far as I can tell, prmtop files merge this information, making one impossible without the other.

davidlmobley commented 10 months ago

I believe ParmEd dealt with this in the "correct" AMBER way which is to actually provide bond details for water even if they aren't intended to be used. Ah yes, you caught this above:

ParmEd works around this by assigning a (presumably) arbitrary force constant of 50,000 to these bonds, including an H-H "bond."

I don't like this solution but I think it's the best one for AMBER.

(I'm in a rush so let me know if I missed soemthing important.)

mattwthompson commented 1 week ago

I guess this is about as good as it's going to get