openforcefield / openff-toolkit

The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools. Documentation available at http://open-forcefield-toolkit.readthedocs.io
http://openforcefield.org
MIT License
318 stars 93 forks source link

Always `getattr()` in lazy import machinery #1963

Open Yoshanuikabundi opened 2 weeks ago

Yoshanuikabundi commented 2 weeks ago

@lilyminium 's fix in #1961 was exactly right:

- return mod.__dict__[name]
+ try:
+     return mod.__dict__[name]
+ except KeyError:  # account for lazy loaders
+     return getattr(mod, name)

My only tweak is that I don't think looking in __dict__ is now ever useful. I originally did this out of an abundance of caution to avoid infinite recursion where __getattr__ is calling itself via getattr. Having reviewed it now I'm convinced that this recursive case can't actually happen. In any case, the current approach of trying __dict__ and falling back to getattr does not protect against this recursion error, so this PR simplifies the code and adds an explanatory comment.

The remainder of this description just gives context on why this recursion shouldn't happen. It's summarized in a safety comment in the code itself. I may or may not have been burned by indirect recursion before so I just wanted to document this.


This infinite recursion can happen if an object that doesn't exist in openff.toolkit is included in the _lazy_imports_obj dictionary as though it was in that module:

# openff/toolkit/__init__.py

_lazy_imports_obj = {
    ...
    "foo": "openff.toolkit", # But we're already in `openff.toolkit`, and `foo` does not exist
}

# user code
from openff.toolkit import foo # RecursionError: maximum recursion depth exceeded

It doesn't happen for objects that are simply missing from _lazy_imports_obj because the name is not found in the _lazy_imports_obj dictionary, and it doesn't happen for objects in the current module that are present in _lazy_imports_obj, because these objects are found in the usual Python way and never make it to the custom __getattr__. It doesn't happen for objects from other modules because it's not recursive in this case - instead of openff.toolkit.__getattr__ calling getattr calling openff.toolkit.__getattr__, getattr calls some_other_module.__getattr__.

This error shouldn't happen for two reasons: there's no reason to include an object from the current module in _lazy_imports_obj, and there's no reason to include an object that doesn't exist. For it to happen, we'd have to make a release that tried to lazy load an object that doesn't exist from the one module that we know is already loaded, and users would have to try to load that exact name (which, again, doesn't exist). If this unusual circumstance happened with the original code, it would fail with the usual Python message about not being able to import something. With the new code in this PR or the existing code, it's a recursion error instead, but it still fails. Since the worst case scenario is just a slightly different error message and it requires a coordinated mistake between users and us, I think this is an acceptable risk.

codecov[bot] commented 2 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 93.74%. Comparing base (8aa96bc) to head (0705bee).

Additional details and impacted files