GrahamDumpleton / wrapt

A Python module for decorators, wrappers and monkey patching.
BSD 2-Clause "Simplified" License
2.03k stars 231 forks source link

AttributeError when using an import hook for module run directly with `python -m` #238

Closed lazorchakp closed 1 year ago

lazorchakp commented 1 year ago

I believe I've found a bug in wrapt's _ImportHookChainedLoader - here is the reproduction case:

# ./sitecustomize.py
import wrapt

@wrapt.when_imported("lib")
def notify_lib_imported(_):
    print("detected lib import!")
# ./lib.py
print("in lib.py")

This results in the following:

Traceback (most recent call last):
  File "/Users/peterlazorchak/.pyenv/versions/3.10.11/lib/python3.10/runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/Users/peterlazorchak/.pyenv/versions/3.10.11/lib/python3.10/runpy.py", line 157, in _get_module_details
    code = loader.get_code(mod_name)
AttributeError: '_ImportHookChainedLoader' object has no attribute 'get_code'

I suspect this could be solved with something like this:

class _ImportHookChainedLoader:

    def __init__(self, loader):
        self.loader = loader

        if hasattr(loader, "load_module"):
          self.load_module = self._load_module
        if hasattr(loader, "create_module"):
          self.create_module = self._create_module
        if hasattr(loader, "exec_module"):
          self.exec_module = self._exec_module

        # new code
        if hasattr(loader, "get_code"):
          self.get_code = loader.get_code

That being said, I don't understand this part of the code very well, so maybe my solution is too simple. Any input you have would be greatly appreciated. Thanks!

GrahamDumpleton commented 1 year ago

More than that may be required. Seems there are a number of abstract base class (ABC) definitions that these loaders could technically implement and there are more methods that just that one. Eg.,

The capabilities represented by the ABC's are meant to be optional though, so presumably anything wanting to rely on them should be checking that a loader is such an ABC and only then trying to access the functionality. It seems runpy.py may not be making such checks because it existed before the ABC's were added, or because it can't work without the method either way.

So I will need to dig into what they are about and whether it is necessary to detect whether the wrapper loader implements a particular ABC and in that case proxy all the methods of that ABC.

drdavella commented 1 year ago

I've been following this issue as well (I happen to know @lazorchakp 😄) and I'm curious whether it's potentially as simple as making _ImportHookChainedLoader a subclass of ObjectProxy? I haven't thought it through all the way so maybe that doesn't make sense.

GrahamDumpleton commented 1 year ago

That may work, had never considered it before. I will need to think it through and refresh my understanding of what the loader is actually doing.

GrahamDumpleton commented 1 year ago

After a bit of testing it seems ObjectProxy might be able to be used as base class to _ImportHookChainedLoader however there is one complication. In the existing code of __init__() there was:

        if hasattr(loader, "load_module"):
          self.load_module = self._load_module
        if hasattr(loader, "create_module"):
          self.create_module = self._create_module
        if hasattr(loader, "exec_module"):
          self.exec_module = self._exec_module

This is to deal with the migration over time from using load_module() to create_module()/exec_module(). It is possible that loaders may not implement the newer methods still. Anyway, what it means is that I can't just blindly proxy all three methods as this may break code which only implements the older mechanism as anything using the loader will see the proxy functions for the latter and think the newer mechanism is implemented, resulting in a failure when it finds the wrapped class doesn't.

Unfortunately optionally adding proxy functions which do extra work only when wrapped function exists isn't really possible from memory with the way things are. Thus I would need to detect whether a loader implements old or new mechanism in function which creates _ImportHookChainedLoader instead and create different variants of the loader proxy object.

Anyway, will need to think about it some more to see if there is a better way.

GrahamDumpleton commented 1 year ago

I found a solution, but it only works if using pure Python implementation of wrapt. CPython throws an error when using C extension.

The error is:

Error while finding module specification for 'lib' (TypeError: can't apply this __setattr__ to _ImportHookChainedLoader object)

Am really confused about what is the trigger for this. Some interesting issues related to that error are:

Will try and spend some more time looking at it at a later time.

GrahamDumpleton commented 1 year ago

Have a workaround for the problem was having, even if don't understand the real cause. So solution I have will allow me to optionally add the proxy functions still, thus meaning can use ObjectProxy as base class. I need to clean up my changes and then think through what problems the changes may cause. :-)

drdavella commented 1 year ago

Thanks for the updates! I haven't dug into the extension implementation but from a quick glance at those Python bug reports, is it possible that something needs to inherit type?

Also this is a pretty minor thing and I should have asked earlier, but is it possible that _ImportHookChainedLoader should become part of the public API?

GrahamDumpleton commented 1 year ago

In the case of SQLAlchemy, the issue came up for them when using meta classes, so deriving from type was more correct anyway. This is not the case in wrapt as where it uses a meta type for the pure Python implementation, it already derives the meta class from type. So wrapt usage is different and I have no idea why it triggers the problem. As I said though, I found a way around it even if I don't understand the original cause.

As to making _ImportHookChainedLoader part of the public API, what would be the use case for doing that? I can't see why anyone would want to use it directly.

GrahamDumpleton commented 1 year ago

FWIW, been rather busy the last few weeks, but so I at least don't loose the test code I did, I have pushed to develop branch the changes to make the import hook loader class use ObjectProxy as a base. Although I note I made a typo in something so need to fix that now.

Anyway, it seems to pass the bundled tests, but they though don't necessarily exercise the changes very well, so if you are able to try with develop trunk out of GitHub directly, let me know how it goes.

lazorchakp commented 1 year ago

Thank you so much for the updates. I was just able to test this with your latest commit to develop (f2ad8e2) and it appears to have fixed the AttributeError in both the minimal example from this issue and my larger project.

Something interesting I've noticed, however, is that the import hook does not actually fire. To be clear, using the original example in this issue, when I run PYTHONPATH=. python -m lib, I would expect the following output:

in lib.py
detected lib import!

But I'm actually seeing

in lib.py

If I enter a repl with PYTHONPATH=. python and >>> import lib, I get the expected output.

To be honest, this isn't particularly critical for my current project, but it seems like unexpected / unintended behavior for wrapt in general. If it will require too much effort on your end to address, I'm happy to close this issue since the current fix does prevent the AttributeError.

GrahamDumpleton commented 1 year ago

The -m option is a front end for the runpy module. When runpy imports the supposed module to run, it actually bypasses the normal top level Python module import loading mechanism and does strange stuff to import the module you specify by accessing the loaders directly. I am not sure why it does that and what exactly it does that results in it bypassing the wrapt loader, but lib never ends up in sys.modules.

Worth noting, if lib.py imported sub module defined in sub.py in same directory, you can attach an import hook to sub, you just can't do it for the initial lib.

So no idea really and not sure if it is worth it to try and make that specific corner case work.

lazorchakp commented 1 year ago

Sounds good - that is strange. Thanks for investigating this 👍