Closed j-wags closed 2 years ago
This is a great idea! I like the first option, but calling it citable_name
instead of name
so that it's clear that we can use that name to refer to the force field. It should ideally be the first line in the force field file, and we should likely not include it for non-citable releases.
I like John's idea.
Except I have a slight parsing problem with "citable_name" because my brain says, "What's a CI Table?". Maybe that's just me though. I feel like my brain would do better with "formal_name" or "forcefield_name" or "citation_name" or something.
I transferred this over from the toolkit; even though it's two years old now, it's worth either considering it or shutting the door on it.
people will very likely report the wrong version/FF name when using Parsley, as openff-1.0.0.offxml begins with ... So, we'll probably see some variants of "We used SMIRNOFF 0.3 for this simulation" in the future.
Has this happened? I have only seen "OpenFF 1.0" at times squishier things like "the open force field version 1". As far as I've heard, this hasn't been a major issue in the force field releases that we've done. People presumably already know the version of the force field they used from the version of the openff-forcefields
package they have installed or what they access in the ForceField
constructor. People seem to be able to access to the version of the force field they use, and our force fields' names (or code names?) aren't really adding new information - do we need name attributes in the file and object?
There is also the problem of this information persisting in the ForceField
object, which is (intentionally) malleable.
>>> from openff.toolkit.typing.engines.smirnoff import ForceField
>>> sage = ForceField("openff-2.0.0.offxml")
>>> sage['Bonds'].parameters[0].k
<Quantity(529.242972, 'kilocalorie / angstrom ** 2 / mole')>
>>> sage['Bonds'].parameters[0].k *= 2
>>> sage['Bonds'].parameters[0].k
<Quantity(1058.48594, 'kilocalorie / angstrom ** 2 / mole')>
>>> sage.citable_name
If this returned 2.0.0
or something, that would be bad.
It would also be bad if somebody wrote out this file to openff-2.0.0.offxml
and distributed that file somewhere.
This seems like something that's trivial on the surface but runs into complexity when considering the details. If there isn't a huge need for this, I would rather this stay un-supported (as it has for however long now).
This has not happened to my knowledge. I think since the file names are clear, people have not used the version in the XML tags as a force field version number. It also helps that these version numbers are rather distinct, e.g. we weren't using force field version 1.2 with spec version 1.1 or something similar. Anyone who MIGHT be tempted by this has probably noticed spec version 0.3 with force field version 1.x or 2.0 and interpreted it correctly.
I think I'm with you that this is complex to implement well AND not something which has caused problems so probably not a problem we need to fix, though I initially thought we should.
I'm closing this as it seems there's not a compelling need to take action and nobody has taken up that mantle. (In my experience, a vanishingly small portion of users open up the OFFXML files directly, so the fact that our ForceField
class sweeps the <?xml version="1.0" encoding="utf-8"?>
line under the rug mostly sidesteps this issue.)
If this does become an issue in the future, people will likely come across it with the right search:
SMIRNOFF version vs. force field version vs. XML version vs. OFFXML version SMIRNOFF versioning vs. force field versioning vs. XML versioning vs. OFFXML versioning
Won't this be a problem if we serialize force fields for transmission, or pickle their contents into objects? The filename is not guaranteed to be associated with them in any of these cases.
Is your feature request related to a problem? Please describe. @cbayly13 points out that people will very likely report the wrong version/FF name when using Parsley, as
openff-1.0.0.offxml
begins withSo, we'll probably see some variants of "We used SMIRNOFF 0.3 for this simulation" in the future. To avoid this, it would be good to have some record of the FF name inside of the file contents.
One major constraint is that the
ForceField
object currently has no memory of the original file name, thus it couldn't return the FF name if it tried. And, moreso, it's perfectly valid to feed a large XML string containing an entire force field to the constructor, in which case the FF name definitely wouldn't be available.Describe the solution you'd like
This will require some discussion. Some potential solutions are:
ForceField
class, so these would be stripped after one I/O cycle. This may actually be a good thing.