banesullivan / scooby

🐶 🕵️ Great Dane turned Python environment detective
MIT License
47 stars 12 forks source link

Improve load time #85

Closed banesullivan closed 2 years ago

banesullivan commented 2 years ago

Helps #79 to speed up scooby import time since distutils is imported for a lesser used method get_standard_lib_modules()

We can remove/deprecate in a follow-up PR. This change speeds up the import time from 0.378s to 0.044s for me

cc @prisae

banesullivan commented 2 years ago

I lied. This doesn't do anything after merging with main .... will fix

banesullivan commented 2 years ago

@prisae, some of the logic with psutil and mkl in knowledge.py also slows down the import a good bit but not nearly as much

prisae commented 2 years ago

Thanks @banesullivan for jumping quick on this! Yes, now that distutils is gone the picture looks very different. 2022-07-22-02

Zooming in to scooby: 2022-07-22-03

I pushed to also lazy-import multiprocessing, resulting in 2022-07-22-04

Overall, this is a massive difference. The big pieces are now:

These changes would be already very small compared to the changed achieved so far. I wonder if we should do them (hide all the imports in the functions), or leave it as is for know. It might be worth holding on for a while, there are "things in the works" I think:

codecov-commenter commented 2 years ago

Codecov Report

Merging #85 (7d62f3f) into main (999d9ca) will increase coverage by 0.08%. The diff coverage is 79.45%.

@@            Coverage Diff             @@
##             main      #85      +/-   ##
==========================================
+ Coverage   87.15%   87.23%   +0.08%     
==========================================
  Files           4        4              
  Lines         366      384      +18     
==========================================
+ Hits          319      335      +16     
- Misses         47       49       +2     
prisae commented 2 years ago

Okay, I also made psutil, mkl and numexpr a lazy load. It is not necessarily "beautiful", but effective. Maybe we should revisit these things if lazy-loading becomes more embedded within Python itself.

The scooby load time is now down to 22% of the whole load process, so 3/4 are out of our hands, which I think is more than sufficient. The speed-up of the pure scooby-share we achieved with this is almost 20x (from ~0.288 s to 0.015s)! 2022-07-22-05

prisae commented 2 years ago

This brings an interesting question. Should we add a benchmark testing load time, so we do not accidentality introduce a slow load at a later point?

Not sure how to do that exactly. Here a very ugly way, probably very error prone, but at least an idea ;-) It asserts that the import time is less than 0.1 seconds (maybe we could aim at 0.05 s).

import subprocess

out = subprocess.run(
    ["python", "-X", "importtime", "-c", "import scooby"], capture_output=True
)
itime = int(out.stderr.decode("utf-8").split("\n")[-2].split('|')[1].strip())
assert itime < 100_000  # maybe even 50_000?

Maybe a bit better (although not sure if cross-platform)


import subprocess

out = subprocess.run(
    ["time", "-f", "%U", "python", "-c", "import scooby"], capture_output=True
)
assert float(out.stderr.decode("utf-8")[:-1]) < 0.1
prisae commented 2 years ago

It is fantastic, thanks @banesullivan - in my application the load time spent on scooby went from 0.077s (5%!) to 0.001s (irrelevant), which is fantastic!

(Note the difference, not from 0.288s to 0.015s as reported here; I assume when importing a package together with many other there are some shared benefits, so importing scooby alone is different than importing scooby within another package.)

prisae commented 2 years ago

I just realized that your solution here @banesullivan will get into merge conflict with the solution of @akaszynski in #83 - can we get first #83 in?

akaszynski commented 2 years ago

I just realized that your solution here @banesullivan will get into merge conflict with the solution of @akaszynski in #83 - can we get first #83 in?

Give me a minute.

akaszynski commented 2 years ago

@prisae, I'll leave it to you to resolve the conflicts. Consider this approved.

akaszynski commented 2 years ago

This brings an interesting question. Should we add a benchmark testing load time, so we do not accidentality introduce a slow load at a later point?

Not sure how to do that exactly. Here a very ugly way, probably very error prone, but at least an idea ;-) It asserts that the import time is less than 0.1 seconds (maybe we could aim at 0.05 s).

import subprocess

out = subprocess.run(
    ["python", "-X", "importtime", "-c", "import scooby"], capture_output=True
)
itime = int(out.stderr.decode("utf-8").split("\n")[-2].split('|')[1].strip())
assert itime < 100_000  # maybe even 50_000?

Maybe a bit better (although not sure if cross-platform)

import subprocess

out = subprocess.run(
    ["time", "-f", "%U", "python", "-c", "import scooby"], capture_output=True
)
assert float(out.stderr.decode("utf-8")[:-1]) < 0.1

Enforcement of execution times is a bit hard, especially on CI where we have no control over the hardware.

I honestly wish we could record the total number of cpu instructions rather than execution time. That's (probably) less variable.


Regardless, I see no reason why you couldn't warn for execution times. Just use CI/CD as the worst case.

prisae commented 2 years ago

Good point @akaszynski - but this was indeed my main idea. Just to have a simply test that would warn you if you add by accident a new dependency that takes, say 1s to load. There should go a flag up somewhere.

akaszynski commented 2 years ago

Good point @akaszynski - but this was indeed my main idea. Just to have a simply test that would warn you if you add by accident a new dependency that takes, say 1s to load. There should go a flag up somewhere.

1s is worse than even the original implementation. I'd say tack on 50% to the import time on our CI/CD with this implementation and then warn if we exceed.

prisae commented 2 years ago

Down to roughly 3 ms - I think that is good, given that scooby should be a minimalistic library without footprint/burden on the importing package!

2022-07-23-01

prisae commented 2 years ago

I will add now a small test and double-check everything before merging.

prisae commented 2 years ago

OK, CI/CD is quite slower than local. Imports took between 0.08s and 0.11s; I set the test for now to 0.15s. That should hopefully work as red flag. If it starts to fail we have to either increase the limit or search for the culprit.

prisae commented 2 years ago

Thanks both for the feedback! Would anyone mind making a release (@banesullivan, @akaszynski) - or tell me, what I would have to do for a release? I don't think I have ever done one for scooby.

akaszynski commented 2 years ago

Thanks both for the feedback! Would anyone mind making a release (@banesullivan, @akaszynski) - or tell me, what I would have to do for a release? I don't think I have ever done one for scooby.

I can do it. We don't actually document our release approach, but you can glance over https://github.com/pyvista/pyvista/blob/main/CONTRIBUTING.rst to see how we do it for PyVista. I'm just going to implement that.

@banesullivan, I hope you're fine with this. I never really liked having a non-dev version on main.

Erotemic commented 2 years ago

FWIW because this is Python 3.7+ Using module level __getattr__ would have also been a valid way to do module lazy loading.

prisae commented 2 years ago

Thanks @Erotemic! Would you mind giving an example how one could use __getattr__ in that context?

banesullivan commented 2 years ago

Perhaps something like this? https://anvil.works/blog/lazy-modules

If so, (I also did not read that in detail) I don't see why you'd do this... having from package import module_or_func nested in a function for lazy loading seems more than sufficient to me.

Erotemic commented 2 years ago

@prisae

def __getattr__(key):
    if key == 'platform':
        import platform
        return platform
    else:
        raise AttributeError(key)

Including the above code in the module would allow you to just access platform as an attribute.

There is good discussion of how this can be used more broadly in a scikit-image PR.

@banesullivan I think the current way is also sufficient. It just looks weird to me to include the () at the end of what looks like a module name, which is why I brought it up. Either solution is valid. Contrary to popular belief, there does not need to be only one obvious way to do something.

prisae commented 2 years ago

That is interesting @Erotemic and good to know - A lot seems to go on at the moment with regards to lazy loading modules in the scientific python stack. I also like the way that SciPy will load newly all submodules only "on-demand", but do not have to be imported explicitly any longer.

Erotemic commented 2 years ago

FYI: I have a package mkinit that makes it really easy to expose your entire top-level API via lazy or explicit imports. It works by statically parsing your code and then autogenerating boilerplate for __init__.py. The lazy variant for a package looks like this:

    def lazy_import(module_name, submodules, submod_attrs):
        """
        Boilerplate to define PEP 562 __getattr__ for lazy import
        https://www.python.org/dev/peps/pep-0562/
        """
        import importlib
        import os
        name_to_submod = {
            func: mod for mod, funcs in submod_attrs.items()
            for func in funcs
        }

        def __getattr__(name):
            if name in submodules:
                attr = importlib.import_module(
                    '{module_name}.{name}'.format(
                        module_name=module_name, name=name)
                )
            elif name in name_to_submod:
                submodname = name_to_submod[name]
                module = importlib.import_module(
                    '{module_name}.{submodname}'.format(
                        module_name=module_name, submodname=submodname)
                )
                attr = getattr(module, name)
            else:
                raise AttributeError(
                    'No {module_name} attribute {name}'.format(
                        module_name=module_name, name=name))
            globals()[name] = attr
            return attr

        if os.environ.get('EAGER_IMPORT', ''):
            for name in submodules:
                __getattr__(name)

            for attrs in submod_attrs.values():
                for attr in attrs:
                    __getattr__(attr)
        return __getattr__

    __getattr__ = lazy_import(
        __name__,
        submodules={
            'submod',
            'subpkg',
        },
        submod_attrs={
            'submod': [
                'submod_func',
            ],
            'subpkg': [
                'nested',
                'nested_func',
            ],
        },
    )

    def __dir__():
        return __all__

    __all__ = ['nested', 'nested_func', 'submod', 'submod_func', 'subpkg']

Which is the lazy version of:

    from mkinit_demo_pkg import submod
    from mkinit_demo_pkg import subpkg

    from mkinit_demo_pkg.submod import (submod_func,)
    from mkinit_demo_pkg.subpkg import (nested, nested_func,)

    __all__ = ['nested', 'nested_func', 'submod', 'submod_func', 'subpkg']

Now you could maintain this yourself. Or you could just run mkinit mkinit_demo_pkg --recursive --lazy --write to regenerate it if there is an API change.