Closed IAlibay closed 1 month ago
Hmm, yeah that's a bit of a surprises. Each of these things have some reason to behave the way they do but the inconsistency isn't great.
Positions obtained from an interchange object should include all the particles in the system, including virtual sites.
Unfortunately, this is the heart of the matter. This is a consequence of the first-class view mirroring the toolkit's behavior of not knowing about virtual sites but other API points knowing about them. Right now, Interchange.topology
can't store virtual sites[^1], and I'm not sure how I feel about Interchange.positions
having a different shape than the number of atoms.[^2]
Various API points for getting positions should all lead to the same outcomes
Lots of API points have arguments like include_virtual_sites
wired through them since there are some use cases that only want the atomic particles; I'm not sure how this would implemented on something that's a pydantic.Field
, but maybe there's a way to wrap something up that handles both cases?
At very least we should document all of this better, but I'm not sure how best to do that since this touches so many parts of the codebase.
[^1]: This is because it's a Topology
object from the toolkit, and its representation of molecules does not store virtual sites. I actually advocated for this change, since molecules don't have virtual sites and I'm not sure how one would usefully describe virtual sites in the absence of a force field.
[^2]: In the past, we've talked about the need for a "post-parametrization" representation of a topology. Including virtual sites, which are stored elsewhere in exporters and in pretty fly-by-night ways, is a key use case that would be enabled by this change. I think some of the particle accounting would also be made easier, both internally and for downstream users.
At very least we should document all of this better,
To me this sounds like a good starting solution. I think there's a bit of inconsistency in the docs (or maybe I'm looking at the wrong code examples), so if everything encourages to_openmm_positions, then it shouldn't be as easy for folks to fall into this trap.
Wow, I see I interpreted this originally to be about the difference between .to_openmm_positions
and .topology.positions.to_openmm()
(even though the API is slightly different) but you were actually getting caught on .positions.to_openmm()
. I'm punting on really fixing that since, as mentioned above, just documenting things is a better short-term solution.
In the future, we ought to have some sort of InterchangeTopology
subclass that papers over these issues, and maybe something similar with the positions attribute.
In any case, I've changed the default behavior of the OpenMM positions getter to exclude virtual sites by default - arguably what it should have always done - to avoid this particular API inconsistency. I agree that including virtual sites by default is the better of the two options, but flipping that switch requires more internal refactors that I'm unlikely to be able to get to soon.
Calling this resolved (for now) with #1074, but see above for ideas of future work
Expectation
Description
Interchange.positions.to_openmm()
yields positions without virtual sites.to_openmm_positions
yields positions with virtual sites.Reproduction
Output
Software versions
Interchange v0.3.29