openforcefield / openff-interchange

A project (and object) for storing, manipulating, and converting molecular mechanics data.
https://docs.openforcefield.org/projects/interchange
MIT License
69 stars 23 forks source link

Boxes from OpenMM `System`s are `OpenFFQuantity[list[list[OpenMMQuantity]]]` #879

Closed Yoshanuikabundi closed 6 months ago

Yoshanuikabundi commented 6 months ago

Description

Loading an OpenMM system with box vectors with the Interchange.from_openmm() method, without specifying the box_vectors argument, takes the box vectors for the new Interchange from the old System. This is a great, helpful feature, but the conversion to OpenFF units is a bit messed up. It usually works fine, but when using the + operator to combine Interchanges with numerically identical boxes it can fail as Quantity(2.0, nanometer) != 2.0

This same unit conversion bug seems to be present in both the OpenFF Models unit conversion function and openff.units.ensure_quantity as well.

Reproduction Here's a notebook that creates an RNA molecule, parametrizes it with OpenMM, and then attempts to combine that with an Interchange created with from_smirnoff

rna_from_rdkit.zip

Output

---------------------------------------------------------------------------
UnsupportedCombinationError               Traceback (most recent call last)
Cell In[53], line 1
----> 1 combined_interchange = chol_int + interchange_from_openmm

File ~/Documents/openff/interchange/openff/interchange/_experimental.py:35, in experimental.<locals>.wrapper(*args, **kwargs)
     26 if os.environ.get("INTERCHANGE_EXPERIMENTAL", "0") != "1":
     27     raise ExperimentalFeatureException(
     28         f"\n\tFunction or method {func.__name__} is experimental. This feature is not "
     29         "complete, not yet reliable, and/or needs more testing to be considered suitable "
   (...)
     32         "INTERCHANGE_EXPERIMENTAL=1.",
     33     )
---> 35 return func(*args, **kwargs)

File ~/Documents/openff/interchange/openff/interchange/components/interchange.py:884, in Interchange.__add__(self, other)
    881     self_copy.positions = None
    883 if not np.all(self_copy.box == other.box):
--> 884     raise UnsupportedCombinationError(
    885         "Combination with unequal box vectors is not curretnly supported",
    886     )
    888 return self_copy

UnsupportedCombinationError: Combination with unequal box vectors is not curretnly supported
>>> print(chol_int.box)
[[2.0 0.0 0.0] [0.0 2.0 0.0] [0.0 0.0 2.0]] nanometer
>>> print(interchange_from_openmm.box)
[[Quantity(value=2.0, unit=nanometer) Quantity(value=0.0, unit=nanometer)  Quantity(value=0.0, unit=nanometer)] [Quantity(value=0.0, unit=nanometer) Quantity(value=2.0, unit=nanometer)  Quantity(value=0.0, unit=nanometer)] [Quantity(value=0.0, unit=nanometer) Quantity(value=0.0, unit=nanometer)  Quantity(value=2.0, unit=nanometer)]] nanometer

Software versions

mattwthompson commented 6 months ago

Seems valid to me; I don't think this specific case has been tested, so I'm not surprised.

mattwthompson commented 6 months ago

@Yoshanuikabundi any opinion if, in the case of box_vectors=None, the box vectors should be inferred from the system, topology, or both (with some precedence)? If I'm not mistaken, both OpenMM objects can contain box information but neither must.

mattwthompson commented 6 months ago

I'm not sure where in the attached notebook, maybe there's a different version you mean to upload?

With the current code I'm struggling to see how this could come about. There's no processing of the box_vectors argument, just setting it if provided. (Perhaps the validation could be improved here; I can't imagine this data type cropping up by any means but a mangled validator.)

https://github.com/openforcefield/openff-interchange/blob/v0.3.18/openff/interchange/interop/openmm/_import/_import.py#L35-L80

Yoshanuikabundi commented 6 months ago

I believe the system must contain box info if any of the forces rely on periodicity - at least, ForceField.createSystem() raises an exception if you ask it for PME without giving the topology box vectors. I think taking box vectors from the system is appropriate, as that's where OpenMM takes them from, and the whole point is to import that behavior.

Sorry about linking the wrong notebook - I think I somehow just had old outputs. Here's the freshly executed notebook with correctly incorrect outputs (last few cells): rna_from_rdkit.zip

I agree that it's the validator that's doing it - it seems to be a part of the ArrayQuantity.validate_type() call here: https://github.com/openforcefield/openff-interchange/blob/1caa3d44e840f904791a48955a31981560d85f43/openff/interchange/components/interchange.py#L161C22-L161C56

mattwthompson commented 6 months ago

openff-models 0.1.2 should fix this, I think without any code changes in Interchange itself. Those builds should be online within the hour (https://github.com/conda-forge/openff-models-feedstock/pull/16)

mattwthompson commented 6 months ago

(And here's a comment from this morning which I neglected to send then!)

I believe the system must contain box info if any of the forces rely on periodicity ...

Agreed

I can't get that specific error from that notebook, either, but it wasn't too hard to hunt down the specific tension. Strictly conflicting with the documentation, openmm.System.getDefaultPeriodicBoxVectors returns list[openmm.unit.Quantity[Vec3]], something that the validator is somehow not prepared to work around. Probably because it's as weird case - usually a list-like object is being wrapped, not doing the wrapping.

mattwthompson commented 6 months ago

Looking like this is fixed just with the openff-models release; eventually this will be tested by changes in #883

Yoshanuikabundi commented 6 months ago

@mattwthompson Something strange is still happening - if I use ArrayQuantity.validate_type before passing the box to Interchange.from_openmm, everything's fine, but if I just pass the box straight from the system, or let from_openmm do it automatically, I still get a mangled box type.

>>> interchange_from_openmm = Interchange.from_openmm(
...     topology, 
...     system, 
...     positions, 
...     box_vectors=ArrayQuantity.validate_type(system.getDefaultPeriodicBoxVectors()),
... )
>>> interchange_from_openmm.box
<Quantity([[5.5319 0.     0.    ], [0.     5.9681 0.    ], [0.     0.     6.7198]], 'nanometer')>
>>> interchange_from_openmm = Interchange.from_openmm(
...     topology, 
...     system, 
...     positions, 
...     # Same result if next line is removed
...     box_vectors=system.getDefaultPeriodicBoxVectors(),
... )
>>> interchange_from_openmm.box
<Quantity([[Quantity(value=5.5319, unit=nanometer),  Quantity(value=0.0, unit=nanometer) Quantity(value=0.0, unit=nanometer)], [Quantity(value=0.0, unit=nanometer),  Quantity(value=5.9681, unit=nanometer),  Quantity(value=0.0, unit=nanometer)], [Quantity(value=0.0, unit=nanometer) Quantity(value=0.0, unit=nanometer),  Quantity(value=6.719799999999999, unit=nanometer)]], 'nanometer')>

This seems to happen before the box validator runs - if I print debug the box validator and the from_openmm function:

# In openff.interchange.interop.openmm._import._import.from_openmm:
...
    elif system is not None:
        print("box from system", system.getDefaultPeriodicBoxVectors())
        interchange.box = system.getDefaultPeriodicBoxVectors()
...

# In openff.interchange.components.interchange.Interchange.validate_box:
...
        if value is None:
            return value
        print("validating", value)
        first_pass = ArrayQuantity.validate_type(value)
        print("first_pass", first_pass)
...
        print("validated", box)

        return box
...

I get:

>>> interchange_from_openmm = Interchange.from_openmm(
...     topology, 
...     system, 
...     positions,
... )
box from system [Quantity(value=Vec3(x=5.5319, y=0.0, z=0.0), unit=nanometer), Quantity(value=Vec3(x=0.0, y=5.9681, z=0.0), unit=nanometer), Quantity(value=Vec3(x=0.0, y=0.0, z=6.719799999999999), unit=nanometer)]

validating [[Quantity(value=5.5319, unit=nanometer)  Quantity(value=0.0, unit=nanometer) Quantity(value=0.0, unit=nanometer)] [Quantity(value=0.0, unit=nanometer)  Quantity(value=5.9681, unit=nanometer)  Quantity(value=0.0, unit=nanometer)] [Quantity(value=0.0, unit=nanometer) Quantity(value=0.0, unit=nanometer)  Quantity(value=6.719799999999999, unit=nanometer)]] nanometer

first_pass [[Quantity(value=5.5319, unit=nanometer)  Quantity(value=0.0, unit=nanometer) Quantity(value=0.0, unit=nanometer)] [Quantity(value=0.0, unit=nanometer)  Quantity(value=5.9681, unit=nanometer)  Quantity(value=0.0, unit=nanometer)] [Quantity(value=0.0, unit=nanometer) Quantity(value=0.0, unit=nanometer)  Quantity(value=6.719799999999999, unit=nanometer)]] nanometer

validated [[Quantity(value=5.5319, unit=nanometer)  Quantity(value=0.0, unit=nanometer) Quantity(value=0.0, unit=nanometer)] [Quantity(value=0.0, unit=nanometer)  Quantity(value=5.9681, unit=nanometer)  Quantity(value=0.0, unit=nanometer)] [Quantity(value=0.0, unit=nanometer) Quantity(value=0.0, unit=nanometer)  Quantity(value=6.719799999999999, unit=nanometer)]] nanometer

So I have no idea what's going on there, but it does suggest an easy-but-hacky fix of just calling the validate_type method when setting the box vectors in from_openmm.

Here's a much more minimal reproducing notebook: box_vecs_from_openmm.zip

mattwthompson commented 6 months ago

Hmm, okay, more test cases to include. I'm not sure why that's failing; it's meant to just be validated when it's set, and either code path

https://github.com/openforcefield/openff-interchange/blob/2da48d03f19cd058eab721345bb782995c5d9b41/openff/interchange/interop/openmm/_import/_import.py#L79-L82

should hit the same validator

https://github.com/openforcefield/openff-interchange/blob/2da48d03f19cd058eab721345bb782995c5d9b41/openff/interchange/components/interchange.py#L155-L174

mattwthompson commented 6 months ago

Lovely, the discrepancy is that this behavior is in ArrayQuantity.validate_type but not from_openmm:

In [8]: system = openmm.XmlSerializer.deserialize(open("system.xml").read())

In [9]: Interchange.from_openmm(
   ...:     system=system, box_vectors=system.getDefaultPeriodicBoxVectors()
   ...: ).box == Interchange.from_openmm(system=system).box
Out[9]:
array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]])

In [10]: from openff.units.openmm import from_openmm

In [11]: Interchange.from_openmm(
    ...:     system=system, box_vectors=from_openmm(system.getDefaultPeriodicBoxVectors())
    ...: ).box == Interchange.from_openmm(system=system).box
Out[11]:
array([[False, False, False],
       [False, False, False],
       [False, False, False]])
mattwthompson commented 6 months ago

Haha, shame on me for thinking that validate_assignment=True meant that assignments would be validated!


In [5]: tmp = Interchange()

In [6]: tmp.Config.validate_assignment
Out[6]: True

In [7]: tmp.box = system.getDefaultPeriodicBoxVectors()

In [8]: tmp.box
Out[8]:
array([[Quantity(value=2.5, unit=nanometer),
        Quantity(value=0.0, unit=nanometer),
        Quantity(value=0.0, unit=nanometer)],
       [Quantity(value=0.0, unit=nanometer),
        Quantity(value=2.5, unit=nanometer),
        Quantity(value=0.0, unit=nanometer)],
       [Quantity(value=0.0, unit=nanometer),
        Quantity(value=0.0, unit=nanometer),
        Quantity(value=2.5, unit=nanometer)]], dtype=object) <Unit('nanometer')>

In [9]: tmp.box = Interchange.validate_box(system.getDefaultPeriodicBoxVectors())

In [10]: tmp.box
Out[10]:
array([[2.5, 0. , 0. ],
       [0. , 2.5, 0. ],
       [0. , 0. , 2.5]]) <Unit('nanometer')>

In [22]: Interchange.__validators__["box"][0].func
Out[22]: <function openff.interchange.components.interchange.Interchange.validate_box(cls, value) -> Optional[pint.util.Quantity]>
mattwthompson commented 6 months ago

And I guess the system objects must contain some sort of box information. I got it in my head at some point that these could be none:

In [5]: system.setDefaultPeriodicBoxVectors(None, None, None)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-a0c46415c88c> in ?()
----> 1 system.setDefaultPeriodicBoxVectors(None, None, None)

~/micromamba/envs/openff-interchange-dev/lib/python3.11/site-packages/openmm/openmm.py in ?(self, a, b, c)
  14362         if unit.is_quantity(c):
  14363             c = c.value_in_unit(unit.nanometer)
  14364
  14365
> 14366         return _openmm.System_setDefaultPeriodicBoxVectors(self, a, b, c)

ValueError: in method System_setDefaultPeriodicBoxVectors, argument 2 could not be converted to type Vec3 const &
mattwthompson commented 6 months ago

Okay, let me be more concise here, after #895 (might make it into a 0.3.20 today)

interchange_from_openmm = Interchange.from_openmm(
    topology, 
    system, 
    positions, 
)
interchange_from_openmm.box # mangled

this one is now fixed

interchange_from_openmm = Interchange.from_openmm(
    topology, 
    system, 
    positions, 
    box_vectors=ArrayQuantity.validate_type(system.getDefaultPeriodicBoxVectors()),
)
interchange_from_openmm.box # perfect

still perfect 🤞

interchange_from_openmm = Interchange.from_openmm(
    topology, 
    system, 
    positions, 
    box_vectors=system.getDefaultPeriodicBoxVectors(),
)
interchange_from_openmm.box # mangled

this one is now fixed

from openff.units import ensure_quantity

ensure_quantity(system.getDefaultPeriodicBoxVectors(), 'openff') # mangled

this one is not fixed; for now I'm waving my hands and saying that this isn't supported. The contortions that need to take place to support this are undesirable, and I hope (perhaps with some more documentation and parsing errors) this isn't needed given the above ways Interchange.from_openmm can digest box information