Ouranosinc / xclim

Library of derived climate variables, i.e. climate indicators, based on xarray.
https://xclim.readthedocs.io/en/stable/
Apache License 2.0

Speed up import #1948

Open huard opened 1 day ago

huard commented 1 day ago

Addressing a Problem?

Import takes 2.5s on my laptop.

Benchmarked with python -X importtime test.py, where test.py contains only import xclim.
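For anyone wanting to reproduce the benchmark, here is a minimal sketch that runs the same -X importtime flag in a fresh interpreter and lists the slowest imports. The helper name import_times and the choice of json as the example module are placeholders for illustration; substitute xclim to repeat the measurement above.

```python
import subprocess
import sys


def import_times(module, top=5):
    """Return the `top` slowest imports triggered by `import module`.

    Each item is (cumulative_microseconds, self_microseconds, name).
    `import_times` is a hypothetical helper, not part of xclim.
    """
    # -X importtime writes one line per imported module to stderr.
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = []
    for line in proc.stderr.splitlines():
        if not line.startswith("import time:"):
            continue
        fields = line[len("import time:"):].split("|")
        if len(fields) != 3:
            continue
        try:
            self_us, cumulative_us = int(fields[0]), int(fields[1])
        except ValueError:
            continue  # skip the header line, whose columns are not numbers
        rows.append((cumulative_us, self_us, fields[2].strip()))
    # Largest cumulative time first.
    return sorted(rows, reverse=True)[:top]


if __name__ == "__main__":
    for cumulative_us, self_us, name in import_times("json"):
        print(f"{cumulative_us:>10} {self_us:>10} {name}")
```

As noted for the dependency numbers, the per-module times depend on import order, so the figures are only meaningful for the exact statement being profiled.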

Potential Solution

For reference, here are import times for some of our dependencies. Note that these numbers are only valid in the xclim context; you'd get different results by testing them individually, since they import each other.

Additional context

Code for lazy import (https://docs.python.org/3/library/importlib.html#implementing-lazy-imports)

import importlib.util
import sys


def lazy_import(name):
    # Locate the module without executing it.
    spec = importlib.util.find_spec(name)
    # Wrap the real loader so execution is deferred to first attribute access.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module

Note that if we lazy-import indicators, they won't be in the xclim registry until their module is actually loaded. The virtual module creation, which relies on the registry, would therefore need to trigger their import.
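To illustrate the registry concern, here is a rough sketch using two throwaway modules written to a temporary directory. The names toy_registry, toy_indicators, REGISTRY and tg_mean are invented for the demonstration and are not xclim's actual registry mechanism; the only assumption carried over is that registration happens as a side effect of executing the module body.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def lazy_import(name):
    # Same recipe as in the importlib documentation.
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module


# Hypothetical stand-ins for a registry and an indicator module (not xclim code).
REGISTRY_SRC = "REGISTRY = {}\n"
INDICATORS_SRC = (
    "from toy_registry import REGISTRY\n"
    "def tg_mean(values):\n"
    "    return sum(values) / len(values)\n"
    "REGISTRY['tg_mean'] = tg_mean  # registration is an import side effect\n"
)


def demo():
    """Return the registry keys before and after the lazy module is first used."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "toy_registry.py").write_text(REGISTRY_SRC)
        Path(tmp, "toy_indicators.py").write_text(INDICATORS_SRC)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            registry = importlib.import_module("toy_registry").REGISTRY
            indicators = lazy_import("toy_indicators")
            before = sorted(registry)  # module body has not run yet
            indicators.tg_mean         # first attribute access executes the module
            after = sorted(registry)
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("toy_registry", None)
            sys.modules.pop("toy_indicators", None)
    return before, after


if __name__ == "__main__":
    print(demo())
```

In this toy setup the registry stays empty until something touches an attribute of the lazily loaded module, which is the step the virtual module creation would have to perform explicitly.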


aulemahal commented 1 day ago

I ran the same tests and piped the output through tuna, as I did in #1135. Here's a snapshot:

[image: tuna snapshot of the import-time profile]

I fear that most of the time is not spent loading indicators. xclim.indices shows up at the top only because of the order of operations. The longest-loading submodule seems to be the fire indicators, which might be due to numba jitting functions eagerly rather than lazily. Some gain could be made there.

huard commented 12 hours ago

Regarding the load time of indices: I commented out the from indices import * line in the __init__, as well as another side import of indices elsewhere in indicators.py. I then computed the difference between the import time in this scenario and in the base scenario.
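A difference like this can also be measured from the outside, without -X importtime. The sketch below is an assumption about how one might script it, not the procedure actually used here: it times a statement in a fresh interpreter, keeps the best of a few runs, and subtracts a baseline. The helper names startup_seconds and import_cost are made up, and json stands in for the package under test.

```python
import subprocess
import sys
import time


def startup_seconds(statement, repeats=5):
    """Best-of-`repeats` wall time to run `statement` in a fresh interpreter."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", statement], check=True)
        best = min(best, time.perf_counter() - start)
    return best


def import_cost(statement, baseline="pass", repeats=5):
    """Extra seconds `statement` adds over `baseline` (noisy for fast imports)."""
    return startup_seconds(statement, repeats) - startup_seconds(baseline, repeats)


if __name__ == "__main__":
    print(f"import json costs about {import_cost('import json'):.3f} s")
```

To compare the two scenarios described above, one would run startup_seconds("import xclim") once against the unmodified checkout and once against the checkout with the indices imports commented out, then take the difference.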