Open Erotemic opened 2 months ago
Draft of concept:
import importlib
import os
import sys


def attach(package_name, submodules=None, submod_attrs=None, eager='auto'):
    """Attach lazily loaded submodules, functions, or other attributes.

    Typically, modules import submodules and attributes as follows::

      import mysubmodule
      import anothersubmodule

      from .foo import someattr

    The idea is to replace a package's `__getattr__`, `__dir__`, and
    `__all__`, such that all imports work exactly the way they would
    with normal imports, except that the import occurs upon first use.

    The typical way to call this function, replacing the above imports, is::

      __getattr__, __dir__, __all__ = lazy.attach(
        __name__,
        ['mysubmodule', 'anothersubmodule'],
        {'foo': ['someattr']}
      )

    This functionality requires Python 3.7 or higher.

    Parameters
    ----------
    package_name : str
        Typically use ``__name__``.
    submodules : set
        Set (or list) of submodules to attach.
    submod_attrs : dict
        Dictionary of submodule -> list of attributes / functions.
        These attributes are imported as they are used.
    eager : str | bool
        If True, access all attributes immediately. If "auto", enable
        this based on the value of the ``EAGER_IMPORT`` environment
        variable.

    Returns
    -------
    __getattr__, __dir__, __all__

    """
    if submod_attrs is None:
        submod_attrs = {}

    if submodules is None:
        submodules = set()
    else:
        submodules = set(submodules)

    attr_to_modules = {
        attr: mod for mod, attrs in submod_attrs.items() for attr in attrs
    }

    __all__ = sorted(submodules | attr_to_modules.keys())

    def __getattr__(name):
        if name in submodules:
            return importlib.import_module(f"{package_name}.{name}")
        elif name in attr_to_modules:
            submod_path = f"{package_name}.{attr_to_modules[name]}"
            submod = importlib.import_module(submod_path)
            attr = getattr(submod, name)

            # If the attribute lives in a file (module) with the same
            # name as the attribute, ensure that the attribute and *not*
            # the module is accessible on the package.
            if name == attr_to_modules[name]:
                pkg = sys.modules[package_name]
                pkg.__dict__[name] = attr

            return attr
        else:
            raise AttributeError(f"No {package_name} attribute {name}")

    def __dir__():
        return __all__

    eager_import_flag = False
    if eager == 'auto':
        # Enable eager import based on the value of the environ
        eager_import_text = os.environ.get('EAGER_IMPORT', '')
        if eager_import_text:
            eager_import_text_ = eager_import_text.lower()
            if eager_import_text_ in {'true', '1', 'on', 'yes'}:
                eager_import_flag = True
            # Could be more fancy here.
            # Compare against the calling package's name (not this
            # module's ``__name__``) and use the original casing.
            if package_name in eager_import_text:
                eager_import_flag = True
    else:
        eager_import_flag = eager

    if eager_import_flag:
        for attr in set(attr_to_modules.keys()) | submodules:
            __getattr__(attr)

    return __getattr__, __dir__, list(__all__)
The `EAGER_IMPORT` environment variable is a way to test that all of your lazy imports were correctly specified. Every module that uses lazy_loader checks this variable and attempts to access all of its members if it is set.
This is a useful feature to have, but I'm running into an issue where the huggingface library is using lazy_loader, and attempting to test my module with `EAGER_IMPORT` causes failures. I just want to test my module. To avoid this I'm thinking about how the user can specify only a specific module to eager import.

Perhaps `EAGER_IMPORT` could be given as a pattern, and if `__name__` matches the pattern, then it can trigger the case, or if it is a special code like "1" or "True", then it triggers for every module (i.e. implements the existing behavior).

Another option is to make separate environment variables for each module, but the casing of the variable names seems like it might cause confusion.
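
To make the pattern idea concrete, here is a minimal sketch of how the check could look. The helper name `should_eager_import`, the comma-separated syntax, and the use of shell-style globs via `fnmatch` are all assumptions for illustration; none of this is existing lazy_loader behavior.

```python
import fnmatch
import os

# Special values that keep the existing "eager import everything" behavior.
TRUTHY = {"1", "true", "on", "yes"}


def should_eager_import(package_name, value=None):
    """Hypothetical helper: should ``package_name`` be imported eagerly?

    ``value`` defaults to the ``EAGER_IMPORT`` environment variable.
    """
    if value is None:
        value = os.environ.get("EAGER_IMPORT", "")
    value = value.strip()
    if not value:
        return False
    # Special codes trigger eager import for every package.
    if value.lower() in TRUTHY:
        return True
    # Otherwise treat the value as comma-separated glob patterns
    # (assumed syntax) and match them case-sensitively.
    patterns = [p.strip() for p in value.split(",") if p.strip()]
    return any(fnmatch.fnmatchcase(package_name, p) for p in patterns)
```

With this, `EAGER_IMPORT=mypkg*` would trigger eager imports for `mypkg` and its subpackages but leave other libraries lazy. One caveat of this sketch: a package literally named `on` or `yes` could not be targeted on its own, because those values are reserved as special codes.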