python / cpython

The Python programming language
https://www.python.org
Other
62.73k stars 30.07k forks source link

zipimport cannot do a namespace import when a directory has no python files, but it contains nested directories with python files inside of a zip file. #121111

Closed AraHaan closed 2 months ago

AraHaan commented 3 months ago

Bug report

Bug description:

When on the filesystem packages like discord.py (pip install discord.py) works just fine (from discord.ext.commands import Bot).

However, when you use the same code used to create the python312.zip file to put that package inside of a zip file (the aiohttp dependency would need to be loaded from the user site-packages folder as it contains c extensions files inside of it's package) then zipimport's _is_dir bugs out and says that it cannot import module discord.ext because it most likely thinks it is a file (it is not as it clearly contains nested directories).

The fix: The simple fix is if the first check returns false, is to do a recursive subdirectory check and see if it has subdirectories of any nesting levels with python files inside (__init__.py[c], etc) and to return true if that is the case.

Temporary solution: Manually create an empty __init__.py when the interpreter sees this sort of issue and contribute it back to the package owner.

While it might sound like a good suggestion at first, some maintainers like to randomly block people from their github from filing issues that they identify as a skill issue so the best fix is the one in zipimport as there is no control over what package authors do, but there are use cases for zipping everything into a single zip file as well (embedding). Also fixing zipimport would be a simple patch once and done chore as well as opposed to filing an issue with every package that a person could place into a zip file and use.

CPython versions tested on:

3.12, 3.13, CPython main branch

Operating systems tested on:

Linux, macOS, Windows

Linked PRs

AraHaan commented 3 months ago

The issue was originally identified in this issue as well: https://github.com/SeaHOH/memimport/issues/3

ZeroIntensity commented 3 months ago

(I'd like to work on this issue, please assign it to me.) It seems that zipimport.zipimporter doesn't like any subdirectory directory that doesn't contain an __init__.py file:

A quick repro:

$ mkdir x
$ touch x/a.py
$ zip -r x.zip x
import zipimport
import importlib.util

x = zipimport.zipimporter("x.zip")
spec = x.find_spec("x")
mod = importlib.util.module_from_spec(spec)
x.exec_module(mod)  # Error!

(As an additional note, typeshed does not have the exec_module method on zipimporter yet. See this issue).

Although, this works just fine when you do it the normal way:

import sys

sys.path.append("./x.zip")
import x
ZeroIntensity commented 3 months ago

Ah, FWIW, I was being dumb - my example here is unable to import a submodule anyway:

import sys

sys.path.append("./x.zip")
import x

(I don't think this is something that could be added, and out of the scope of this issue anyway.) Here's the actual bug:

Using the following commands as setup:

$ mkdir a a/b a/b/c
$ touch a/__init__.py a/b/c/__init__.py
$ zip -r a.zip a
$ rm -rf a

The following works using import,

import sys
sys.path.append("a.zip")

import a.b.c

but the following does not, using zipimport:

import zipimport

importer = zipimport.zipimporter("./a.zip")
importer.get_code("a")
importer.get_code("a.b")  # Error!
AraHaan commented 3 months ago

Ah, FWIW, I was being dumb - my example here is unable to import a submodule anyway:

import sys

sys.path.append("./x.zip")
import x

(I don't think this is something that could be added, and out of the scope of this issue anyway.) Here's the actual bug:

Using the following commands as setup:

$ mkdir a a/b a/b/c
$ touch a/__init__.py a/b/c/__init__.py
$ zip -r a.zip a
$ rm -rf a

The following works using import,

import sys
sys.path.append("a.zip")

import a.b.c

but the following does not, using zipimport:

import zipimport

importer = zipimport.zipimporter("./a.zip")
importer.get_code("a")
importer.get_code("a.b")  # Error!

it is odd that import a.b.c works but on my zip file from a.b.c import <method here> does not.

ZeroIntensity commented 3 months ago

I didn't test that - does that fail? (Note that you need an __init__.py)

SeaHOH commented 3 months ago

Here is how to repro:

#!/usr/bin/env python3
# test.py

import zipfile
with zipfile.ZipFile('a.zip', 'w') as zf:
    zf.writestr('a/__init__.py', b'')
    zf.writestr('a/b/c/__init__.py', b'')

import sys
sys.path.append("a.zip")

import a.b.c
print(a.b)
$ ./test.py
Traceback (most recent call last):
  File "./test.py", line 12, in <module>
    import a.b.c
ModuleNotFoundError: No module named 'a.b'

When add this line in create zip file:

    zf.mkdir('a/b/')  # need py311
$ ./test.py
<module 'a.b' (<_frozen_importlib_external.NamespaceLoader object at 0x0000000000D371D0>)>
serhiy-storchaka commented 3 months ago

I haven't delved too deeply into this yet, but I think the easiest solution is to add new dummy entries to the files set. In _read_directory() or nearby. #121233 implements this. The test provided by @SeaHOH is passed.

AraHaan commented 3 months ago

I haven't delved too deeply into this yet, but I think the easiest solution is to add new dummy entries to the files set. In _read_directory() or nearby. #121233 implements this. The test provided by @SeaHOH is passed.

hate to ask but could this change to backported to 3.12 and 3.13 as well?

serhiy-storchaka commented 3 months ago

It looks rather like a new feature, so no.

There is a simple workaround -- always add explicit directory entries to the ZIP file. zipfile, shutil.make_archive() and zipapp have been doing this for years.

serhiy-storchaka commented 2 months ago

It seems that the original issue has been resolved.

diegorusso commented 2 months ago

Just a quick comment to say that the make test fails on my setup

testNamespacePackageExplicitDirectories (test.test_zipimport.CompressedZipImportTestCase.testNamespacePackageExplicitDirectories) ... ERROR
testNamespacePackageImplicitDirectories (test.test_zipimport.CompressedZipImportTestCase.testNamespacePackageImplicitDirectories) ... ERROR
testNamespacePackageExplicitDirectories (test.test_zipimport.UncompressedZipImportTestCase.testNamespacePackageExplicitDirectories) ... ERROR
testNamespacePackageImplicitDirectories (test.test_zipimport.UncompressedZipImportTestCase.testNamespacePackageImplicitDirectories) ... ERROR

======================================================================
ERROR: testNamespacePackageExplicitDirectories (test.test_zipimport.CompressedZipImportTestCase.testNamespacePackageExplicitDirectories)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/dierus01/work/ce-sw/repos/cpython/Lib/test/test_zipimport.py", line 352, in testNamespacePackageExplicitDirectories
    self._testPackage(initfile=None)
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
  File "/home/dierus01/work/ce-sw/repos/cpython/Lib/test/test_zipimport.py", line 377, in _testPackage
    mod = importlib.import_module(f'a.b')
  File "/home/dierus01/work/ce-sw/repos/cpython/Lib/importlib/__init__.py", line 88, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1319, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'a.b'; 'a' is not a package
....

Did I miss anything?

serhiy-storchaka commented 2 months ago

These tests are passed on my computer and on all buildbots. It seems that something is wrong with your setup. Did you forget to rebuild Python?

diegorusso commented 2 months ago

I blew the build directory and the error went away. Apologies for the confusion.