openforcefield / openff-toolkit

The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools. Documentation available at http://open-forcefield-toolkit.readthedocs.io
http://openforcefield.org
MIT License
314 stars 92 forks source link

What to do when two installed offxml files have the same relative path from their respective entry points? #326

Open andrrizzi opened 5 years ago

andrrizzi commented 5 years ago

See the discussion here: https://github.com/openforcefield/openforcefield/pull/322#discussion_r285204727

j-wags 39 minutes ago Member

In the case that there are two files that match a given query, can we make any statement about which one will be returned? Maybe something like guaranteeing the order of entry point registration? Or can we blame any ambiguity there on user error?

j-wags 32 minutes ago Member

This was an issue before this PR, but it could be exacerbated by allowing FFs from external sources. I've added a TODO to keep a record of this concern.

andrrizzi 29 minutes ago Author Member

I'm not sure we can control it because, if it's not random, it probably depend on the order the forcefield packages will be installed. I can change the code to raise a warning or an error with a flag to load the first one to make this case a little safer.

j-wags 25 minutes ago Member

This is probably a complicated question, and worthy of further discussion. Like, if multiple FFs with the same name are found, but they have identical contents, maybe that's permissible? This will require some thinking about how the FF community/ecosystem will evolve. Let's open an issue to discuss, but not hold these changes up in the meantime.

jchodera commented 5 years ago

If this happens, we should raise an exception and list all installed versions that match that path, like

Exception: Multiple packages have installed force field files at the specified path 'path/to/forcefield.offxml`:
* packagename1 : path/to/forcefield.offxml
* packagename2 : path/to/forcefield.offxml
To specify a unique choice, qualify the path name with `packagename:path/to/forcefield.offxml` or uninstall one of the conflicting packages.

However, there is one case we can handle without concern: If the files are identical (their MD5 hashes are identical), then we can safely proceed without an exception.

mattwthompson commented 1 year ago

This would be non-trivial to implement error handling of this case and with our current method of distributing force fields an edge case that will only be reached if we make mistakes in deployment. Tracking all files potentially exposed in the plugin system and doing housekeeping operations on the different equality conditions is a significant undertaking. This logic would scales poorly with the number of distributed force fields and may incur performance issues if it must happen before anything is loaded.

If a user happens to have a file of the same name in their local path, I think that's on them. I'm completely comfortable expecting users to avoid collisions with the highly specific file names that are hit when traversing the plugin paths. It bears mention that we have received approximately zero reports of this behavior causing issues for users.

For what it's worth, the current behavior unintentionally prefers files in the local path. It's trivial to adjust the priority of different paths, but I would not advocate for doing so unless we commit to guaranteeing particular behavior (currently no such promises exist).

(test) [tmp] cp ../software/openff-forcefields/openforcefields/offxml/openff-1.0.0.offxml .
(test) [tmp] python                                                   13:20:07
Python 3.9.15 | packaged by conda-forge | (main, Nov 22 2022, 08:55:37)
[Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from openff.toolkit import ForceField
>>> ForceField("openff-1.0.0.offxml")
Loading /Users/mattthompson/tmp/openff-1.0.0.offxml
<openff.toolkit.typing.engines.smirnoff.forcefield.ForceField object at 0x109cf68e0>

I advocate for not planning to implement this.