openforcefield / openff-toolkit

The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools. Documentation available at http://open-forcefield-toolkit.readthedocs.io
http://openforcefield.org
MIT License
311 stars 90 forks source link

Re-export `unit` and `Quantity` from toolkit? #1774

Closed mattwthompson closed 10 months ago

mattwthompson commented 10 months ago

Is your feature request related to a problem? Please describe.

Most of the code I write starts with something like this

from openff.toolkit import Molecule, ForceField
from openff.units import unit, Quantity

topology = Molecule.from_x("x").to_topology()
topology.box_vectors = [4, 4, 4] * unit.nanometer

@Yoshanuikabundi's quick-import/re-exports of common classes implemented in https://github.com/openforcefield/openff-toolkit/pull/1192 were a huge improvement to toolkit ergonomics. As a reminder, above was previously

from openff.toolkit.topology import Molecule
from openff.toolkit.typing.engines.smirnoff import ForceField
from openff.units import unit, Quantity

topology = ...

This has me thinking it'd be nice to just condense it further to

from openff.toolkit import Molecule, ForceField, unit

topology = ...

I can't imagine a use case in which a user has openff-toolkit correctly installed and is missing openff-units. Even if it wasn't a part of the conda recipe, the current code requires it for all key functionality.

Since this is just a matter of moving around some imports, there shouldn't be any added stability issues with either package. (In other words, if from openff.units import unit has to break, it'd break the same if it's in openff/toolkit/__init__.py or one of the submodules.)

The module is already imported when importing any of the high-level classes:

In [1]: import sys

In [2]: "openff.units" in sys.modules.keys()
Out[2]: False

In [3]: from openff.toolkit import Molecule, ForceField, Topology

In [4]: "openff.units" in sys.modules.keys()
Out[4]: True

so there should not be an impact on import time or performance.

I can see the argument that this obfuscates where exactly these objects are coming from. I think there's benefit (fewer openff.* modules being throw at new users lessens the cognitive burden) and downside (debugging would involve opening up openff/toolkit/__init__.py to find out that it's in a different castle, after all) to that, and roughly estimate it comes out as a wash.

The other downside I considered is name collisions with openmm.unit, but this is an existing (minor) issue and I don't think this change would really affect it.

I don't see a reason to remove the existing import paths in the openff.units package, so code like

from openfe import SolventComponent, ProteinComponent
from openff.units import unit

shouldn't break (or require updating).

In summary, none of the downsides that come to mind (import times, packaging snafus, API stability) have teeth, and this would condense a lot of code by 1 line.

Describe the solution you'd like

diff --git a/openff/toolkit/__init__.py b/openff/toolkit/__init__.py
index 0f685b78..bd960c57 100644
--- a/openff/toolkit/__init__.py
+++ b/openff/toolkit/__init__.py
@@ -11,6 +11,8 @@ from openff.toolkit._version import get_versions  # type: ignore
 if TYPE_CHECKING:
     # These types are imported lazily at runtime, but we need to tell type
     # checkers what they are
+    from openff.units import Quantity, unit
+
     from openff.toolkit.topology import Molecule, Topology
     from openff.toolkit.typing.engines.smirnoff import (
         ForceField,
@@ -39,6 +41,8 @@ __all__ = [
     "OpenEyeToolkitWrapper",
     "RDKitToolkitWrapper",
     "ToolkitRegistry",
+    "unit",
+    "Quantity",
 ]

 # Dictionary of objects to lazily import; maps the object's name to its module path
@@ -53,6 +57,7 @@ _lazy_imports_obj = {
     "OpenEyeToolkitWrapper": "openff.toolkit.utils.toolkits",
     "RDKitToolkitWrapper": "openff.toolkit.utils.toolkits",
     "ToolkitRegistry": "openff.toolkit.utils.toolkits",
+    "Quantity": "openff.units.units",
     # Remember to add new lazy imports to __all__ and the if TYPE_CHECKING imports
 }

@@ -61,6 +66,7 @@ _lazy_imports_mod = {
     "topology": "openff.toolkit.topology",
     "typing": "openff.toolkit.typing",
     "utils": "openff.toolkit.utils",
+    "unit": "openff.units.units",
 }
In [1]: from openff.toolkit import Quantity, unit

In [2]: unit
Out[2]: <module 'openff.units.units' from '/Users/mattthompson/mambaforge/envs/openff-interchange-env/lib/python3.11/site-packages/openff/units/units.py'>

Describe alternatives you've considered

Keep importing the same stuff from their existing import paths.

Additional context

I've seen @lilyminium at times use the trick from openff.toolkit.topology.molecule import unit (mostly for other reasons) and I really liked it. So much that I stole it and am proposing it here.