openforcefield / openff-interchange

A project (and object) for storing, manipulating, and converting molecular mechanics data.
https://docs.openforcefield.org/projects/interchange
MIT License
69 stars 23 forks source link

Bond information is silently lost in openmm roundtrip for constrained systems #1005

Closed IAlibay closed 1 month ago

IAlibay commented 1 month ago

Description

Understandably, for systems with h-constraints, doing a to_opemm/from_openmm roundtrip leads to a loss in bond information. This in turns can lead to cryptic MissingBondError errors when attempting to further export to to a format type that requires bond information, e.g. gromacs TOP.

Reproduction

from openff.interchange import Interchange
from openff.toolkit import Molecule, ForceField
from openff.units import unit
import numpy as np
import os
os.environ['INTERCHANGE_EXPERIMENTAL'] = "1"

mol = Molecule.from_smiles("CC")
mol.generate_conformers()
sage = ForceField("openff-2.0.0.offxml")
cubic_box = unit.Quantity(30 * np.eye(3), unit.angstrom)

interchange = Interchange.from_smirnoff(topology=[mol], force_field=sage, box=cubic_box)
interchange.positions = mol.conformers[0]

interchange2 = Interchange.from_openmm(interchange.to_openmm(), interchange.to_openmm_topology())
for at in  interchange2.topology.atoms:
    resid = int(at.metadata['residue_number'])
    at.metadata['residue_number'] = str(resid+1)
interchange2.to_top('a.top')

Output

This is mostly silent until you get a MissingBondError: Failed to find parameters for bond with topology indices (0, 2)

Proposed solutions

The following would be great:

  1. Document this behaviour in the user guide
  2. If there's a way of doing it, warn users during from_openmm that there might be constrained bonds that aren't being recorded into the Interchange object.

Software versions

mattwthompson commented 1 month ago

This reminds me of the classic error of the same sort we used to get in years past: https://github.com/openforcefield/openff-toolkit/issues/603 (using different terms, so not easy to search for if not already in one's memory bank)

+1 on documenting this quirk, but at times I've wondered if it should be an error instead of a warning. Sometimes those force constants aren't needed, sometimes they definitely are. Maybe there just needs to be a better way to pre-empt the exception in GROMACS, since we don't need to go all the way to writing bond parameters to know it's going to fail ...

IAlibay commented 1 month ago

@mattwthompson - this is more of a user question: do you know if the openmm constrained water -> interchange -> gromacs has been thoroughly checked? (before I spend my own cycles on it).

Our current approach is just to create an unconstrained system w/ rigidWaters=True, which I think then converts fine, but if it's not a super well checked path we can spend a few cycles verifying outputs.

IAlibay commented 1 month ago

at times I've wondered if it should be an error instead of a warning

My 2 cents - given it's an experimental feature: strong opinions (i.e. "this will be a problem in most cases so I error out") is perfectly ok. At least in all our use cases I'd prefer being limited in what I do than accidentally walk through a minefield 😅

mattwthompson commented 1 month ago

openmm constrained water -> interchange -> gromacs

Checked? Yeah

Thoroughly? I wouldn't go that far. If there's a few use cases you're considering and can pass on to me, I'd happily have a look myself

IAlibay commented 1 month ago

If there's a few use cases you're considering and can pass on to me, I'd happily have a look myself

The use cases are all pretty much the same; protein + ligand in "a water" (I can send some input files if it's of help though!).

Mostly just; do your standard waters, e.g. TIP3P, TIP4P-ew, TIP5P, SPC/E waters from OpenMM with and without rigidWater turned on convert to the things you'd expect in gromacs.