Open mattwthompson opened 6 months ago
Thanks for this!
If I understand this correctly, the first thing you are asking is for us to update the documentation. Do you have a suggested doc string? At first I was going to summarize the openmm doc and then include a link to it, but it sounds like even the openmm doc isn't great, so do you have a suggestion? Or, what would you like to see mentioned?
Yeah sorry this is all over the place; it'd be much easier if it was an OpenFE or OpenMM documentation issue and not a mixture of both
The best suggestion I've thought of would be a dressed-up version of
"""
Minimum distance from any solute atoms to the solvent box edge.
This argument is passed directly to OpenMM's addSolvent (link), which
typically produces boxes of size twice the padding distance plus the approximate
solute size, but for some small solutes may treat the solute as having zero size.
"""
This would communicate the quirk I'm observing in a way that squares the responsibility with OpenMM but doesn't assign blame. I think paraphrasing the OpenMM docs risks this docstrings falling out of sync each other, whereas just a link wouldn't be as useful to the OpenFE user. (If the behavior changes in OpenMM, this is a different issue altogether). That covers everything I have been thinking of, the most important thing being that it describes what happens
Thanks @mattwthompson. We could/should add it to this PR: https://github.com/OpenFreeEnergy/openfe/pull/673, unfortunately it didn't make it into 1.0, but the ability to use the wider set of inputs from addSolvent alongside better docstring would be great.
Sorry for the long explanation - the actual issue here is incredibly pointed and I want to make sure I'm highlighting only it.
OpenFE's docstring of the padding option in solvation settings is:
https://github.com/OpenFreeEnergy/openfe/blob/695d03208bc429fc282e4332ce668b6911111889/openfe/protocols/openmm_utils/omm_settings.py#L53
This setting appears to be passed 1:1 to OpenMM without modification
OpenMM's documentation of a similar argument is a little more verbose
For some small molecules, however, this produces results as if the bounding sphere representing the solute is size zero, which doesn't intuitively hold up to me. OpenFF has a similar implementation (and documented argument) that goes through Packmol after determining the expected system size, and for similar systems it frequently produces boxes a few Angstroms larger. I'm running small tests with a benzene-in-water system, and OpenFE's code path produces box vectors of 2.4 nm (simply twice the padding value of 1.2) whereas the OpenFF code path produces numbers closer to 2.8 nm (twice the padding value plus a little bit of fuzziness representing the size of the molecule). Maybe for a single atom it makes sense, but for a 3D molecule with some size to it, I don't think the box vectors should be twice the padding value in all three dimensions. Unfortunately this seems to be what OpenMM does (see below).
Both OpenFF and OpenFE code paths produce sensible results (no crashes) as far as I've observed, so I don't think it makes sense to ask for a behavior change. The documentation doesn't communicate that there should be different behavior, so I think that could be updated in some way. I have no reason to believe results would be different, but accidentally packing into a larger-than-needed-box is an easy way to slow down simulations with no benefit ((2.8 / 2.4) ** 3 ~= 1.6, or 60% slower compute!).
Here's a rough demonstration of how the OpenMM code path produces different system sizes with molecules of different size
If it would be useful I can