scientific-python / lazy-loader

Populate library namespace without incurring immediate import costs
BSD 3-Clause "New" or "Revised" License

Add mechanism to differentiate between modules using EAGER_IMPORT #117

Open Erotemic opened 2 months ago

Erotemic commented 2 months ago

The EAGER_IMPORT environment variable is a way to test that all of your lazy imports were correctly specified.

Every module that uses lazy_loader checks this variable and, if it is set, attempts to access all of its members.

    if os.environ.get("EAGER_IMPORT", ""):
        for attr in set(attr_to_modules.keys()) | submodules:
            __getattr__(attr)

This is a useful feature, but I'm running into an issue: the huggingface library also uses lazy_loader, so setting EAGER_IMPORT to test my module triggers eager imports there as well, and that causes failures. I just want to test my module. To avoid this, I'm thinking about how the user could restrict eager importing to a specific module.
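Until something like that exists, one possible workaround is to skip the environment variable and touch every public name of just the package under test. The sketch below is an illustration only: `touch_all_lazy_attrs` and the stand-in `demo` module are hypothetical names, not part of lazy_loader, and it assumes the package exposes its lazy names through `__all__`.

```python
import types


def touch_all_lazy_attrs(module):
    """Access every name in module.__all__ so lazy imports resolve now.

    Returns a dict mapping each failing name to the exception it raised.
    """
    failures = {}
    for name in getattr(module, "__all__", []):
        try:
            getattr(module, name)
        except Exception as ex:  # a bad lazy spec typically raises ImportError or AttributeError
            failures[name] = ex
    return failures


# Hypothetical stand-in for a package that resolves attributes lazily
# through a module-level __getattr__ (PEP 562).
demo = types.ModuleType("demo")
demo.__all__ = ["good", "bad"]


def _demo_getattr(name):
    if name == "good":
        return 42
    raise AttributeError(f"No demo attribute {name}")


demo.__getattr__ = _demo_getattr

failures = touch_all_lazy_attrs(demo)
```

In a test suite this could be called on the real package and asserted to return an empty dict, which leaves third-party packages lazy.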

Perhaps EAGER_IMPORT could be given as a pattern: if a module's __name__ matches the pattern, eager import is triggered for that module, and if the value is a special code like "1" or "True", it is triggered for every module (i.e. the existing behavior).
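A minimal sketch of how such a check might look, assuming glob-style patterns separated by commas. The helper name `should_eager_import`, the comma separator, and the use of `fnmatch` are all assumptions made for illustration. Note that, unlike the current code where any non-empty value enables eager import, this sketch treats unrecognized values as patterns.

```python
import fnmatch

# Special codes that enable eager import for every module.
_TRUTHY = {"1", "true", "on", "yes"}


def should_eager_import(package_name, env_value):
    """Decide whether `package_name` should eagerly import.

    `env_value` is the raw text of EAGER_IMPORT. A truthy code enables
    eager import everywhere; anything else is read as comma-separated
    glob patterns matched against the package name.
    """
    text = env_value.strip()
    if not text:
        return False
    if text.lower() in _TRUTHY:
        return True
    patterns = [p.strip() for p in text.split(",") if p.strip()]
    # fnmatchcase keeps the comparison case-sensitive on every platform.
    return any(fnmatch.fnmatchcase(package_name, p) for p in patterns)
```

With this, `EAGER_IMPORT=mypkg*` would cover `mypkg` and its subpackages while leaving other libraries lazy.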

Another option is to make a separate environment variable for each module, but casing seems like it might cause confusion.
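To make the casing concern concrete, here is a rough sketch of a per-module variable scheme. The `EAGER_IMPORT_<NAME>` naming convention and both helper names are hypothetical, chosen only to illustrate the idea.

```python
import os
import re


def eager_env_name(package_name):
    """Build a per-package variable name, e.g. 'my.pkg' -> 'EAGER_IMPORT_MY_PKG'."""
    # Environment variable names are conventionally upper-case with
    # underscores, so dots and other punctuation are replaced.
    suffix = re.sub(r"[^0-9A-Za-z]", "_", package_name).upper()
    return f"EAGER_IMPORT_{suffix}"


def eager_requested(package_name, environ=None):
    """Return True if the per-package variable is set to a non-empty value."""
    environ = os.environ if environ is None else environ
    return bool(environ.get(eager_env_name(package_name), ""))
```

Under this scheme `MyPkg` and `mypkg` map to the same variable, and so do `my.pkg` and `my_pkg`, which is the kind of ambiguity that could cause confusion.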

Erotemic commented 2 months ago

Draft of concept:

import importlib
import os
import sys


def attach(package_name, submodules=None, submod_attrs=None, eager='auto'):
    """Attach lazily loaded submodules, functions, or other attributes.

    Typically, modules import submodules and attributes as follows::

      import mysubmodule
      import anothersubmodule

      from .foo import someattr

    The idea is to replace a package's `__getattr__`, `__dir__`, and
    `__all__`, such that all imports work exactly the way they would
    with normal imports, except that the import occurs upon first use.

    The typical way to call this function, replacing the above imports, is::

      __getattr__, __dir__, __all__ = lazy.attach(
        __name__,
        ['mysubmodule', 'anothersubmodule'],
        {'foo': ['someattr']}
      )

    This functionality requires Python 3.7 or higher.

    Parameters
    ----------
    package_name : str
        Typically use ``__name__``.
    submodules : set
        List of submodules to attach.
    submod_attrs : dict
        Dictionary of submodule -> list of attributes / functions.
        These attributes are imported as they are used.
    eager : str | bool
        If True, access all attributes immediately. If "auto", enable this
        based on the value of the ``EAGER_IMPORT`` environment variable.

    Returns
    -------
    __getattr__, __dir__, __all__

    """
    if submod_attrs is None:
        submod_attrs = {}

    if submodules is None:
        submodules = set()
    else:
        submodules = set(submodules)

    attr_to_modules = {
        attr: mod for mod, attrs in submod_attrs.items() for attr in attrs
    }

    __all__ = sorted(submodules | attr_to_modules.keys())

    def __getattr__(name):
        if name in submodules:
            return importlib.import_module(f"{package_name}.{name}")
        elif name in attr_to_modules:
            submod_path = f"{package_name}.{attr_to_modules[name]}"
            submod = importlib.import_module(submod_path)
            attr = getattr(submod, name)

            # If the attribute lives in a file (module) with the same
            # name as the attribute, ensure that the attribute and *not*
            # the module is accessible on the package.
            if name == attr_to_modules[name]:
                pkg = sys.modules[package_name]
                pkg.__dict__[name] = attr

            return attr
        else:
            raise AttributeError(f"No {package_name} attribute {name}")

    def __dir__():
        return __all__

    eager_import_flag = False
    if eager == 'auto':
        # Enable eager import based on the value of the environ
        eager_import_text = os.environ.get('EAGER_IMPORT', '')
        if eager_import_text:
            if eager_import_text.lower() in {'true', '1', 'on', 'yes'}:
                eager_import_flag = True
            # Could be more fancy here. Compare against package_name:
            # inside attach, __name__ is lazy_loader's own module name,
            # not the name of the calling package.
            if package_name in eager_import_text:
                eager_import_flag = True
    else:
        eager_import_flag = eager

    if eager_import_flag:
        for attr in set(attr_to_modules.keys()) | submodules:
            __getattr__(attr)

    return __getattr__, __dir__, list(__all__)