pypa / bandersnatch

A PyPI mirror client according to PEP 381 http://www.python.org/dev/peps/pep-0381/
Academic Free License v3.0

ImportError on S3DirEntry from 'mirror' operation #1683

Open alexander-bauer opened 6 months ago

alexander-bauer commented 6 months ago

Hello, I'm using a fresh install of bandersnatch[s3] in an attempt to establish a private S3-backed mirror. I discovered this issue in Python 3.9, but was able to reproduce it on Python 3.11. Here is an example configuration, followed by the resulting stack trace.

[mirror]
master = https://pypi.org
storage-backend = s3
directory = /my-s3-bucket/
diff-file = bandersnatch-diff
diff-append-epoch = true

json = false
stop-on-error = true
timeout = 30
keep_index_versions = 3
workers = 4

[plugins]
enabled =
  allowlist_project
  allowlist_release
  blocklist_project
  blocklist_release
  project_requirements
  project_requirements_pinned
  exclude_platform

[blocklist]
platforms =
  windows
  macos
  freebsd

[allowlist]
packages =
requirements_path = ./
requirements = *.txt

# bandersnatch -c bandersnatch.conf mirror
2024-03-15 15:50:23,977 INFO: Selected storage backend: s3 (configuration.py:133)
2024-03-15 15:50:23,977 INFO: Selected compare method: hash (configuration.py:181)
Traceback (most recent call last):
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/bandersnatch_storage_plugins/s3.py", line 26, in <module>
    from s3path import S3DirEntry as _S3DirEntry
ImportError: cannot import name 'S3DirEntry' from 's3path' (/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/s3path/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/bin/bandersnatch", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/bandersnatch/main.py", line 225, in main
    return asyncio.run(async_main(args, config))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/bandersnatch/main.py", line 190, in async_main
    return await bandersnatch.mirror.mirror(config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/bandersnatch/mirror.py", line 925, in mirror
    storage_backend_plugins(
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/bandersnatch/storage.py", line 403, in storage_backend_plugins
    return load_storage_plugins(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/bandersnatch/storage.py", line 370, in load_storage_plugins
    plugin_class = entry_point.load()
                   ^^^^^^^^^^^^^^^^^^
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/pkg_resources/__init__.py", line 2471, in load
    return self.resolve()
           ^^^^^^^^^^^^^^
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/pkg_resources/__init__.py", line 2477, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/bandersnatch_storage_plugins/s3.py", line 29, in <module>
    from s3path import _S3DirEntry
ImportError: cannot import name '_S3DirEntry' from 's3path' (/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/s3path/__init__.py)

It appears the commit responsible in the upstream s3path package is https://github.com/liormizr/s3path/commit/5ea0bd23db60c6efd34da732534442ffb8894abf. It looks like this change first appeared in version 0.5.0, and that the latest tag without it is 0.4.2.

A proper fix should probably update the Bandersnatch codebase to use the new public API, but a minimal fix in the meantime would be to adjust requirements_s3.txt to use s3path==0.4.2 rather than its current s3path==0.5.0.
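To confirm which of the two names from the traceback an installed s3path actually exposes, a small check along these lines may help. This is only a diagnostic sketch: the helper name exported_names is hypothetical and not part of bandersnatch or s3path, and it only inspects module attributes without touching any S3 functionality.

```python
import importlib


def exported_names(module_name, candidates):
    """Return the candidates that `module_name` exposes as attributes.

    Returns None if the module itself cannot be imported.
    (Hypothetical helper for illustration only.)
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return [name for name in candidates if hasattr(module, name)]


if __name__ == "__main__":
    # The two names that bandersnatch_storage_plugins/s3.py tries to import
    # in the traceback above.
    print(exported_names("s3path", ["S3DirEntry", "_S3DirEntry"]))
```

An empty list would match the failure shown in the traceback, where neither name could be imported.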

cooperlees commented 6 months ago

Interesting, we don't support / test Python 3.9 to start with. We may have used more recent Python syntax. I would have expected that you'd have to hack things to get it to install on 3.9? If not, that's a bug too.

That aside, I would have expected our CI to catch this on the PR that updated to 0.5.0. Happy to revert for now.

Even more welcome would be a PR that upgrades to the latest APIs and plugs the missing testing. I started locally on https://github.com/pypa/bandersnatch/pull/1672 but just haven't had the time to finish it and test it ... all help welcome.

alexander-bauer commented 6 months ago

I didn't have any trouble installing on 3.9, funnily enough. I was able to get my environment to work by pinning the s3path version: the extra requirement listed in the setup.cfg looks like it's just set to >= 0.4.0, so pip install 'bandersnatch[s3]' 's3path<0.5.0' is resolvable (quoted so the shell doesn't treat < as a redirect or expand the brackets).

I haven't got the cycles to contribute today, unfortunately, but I'm back up and working with just that install tweak.
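For anyone applying the same pin, a quick way to verify that the environment ended up with an s3path release older than 0.5.0 might look like the sketch below. The function names are hypothetical, the 0.5.0 cutoff is taken from this thread, and the version parsing is deliberately simplistic: it reads only the leading integer of each dotted component, so a pre-release such as 0.5.0rc1 is treated the same as 0.5.0.

```python
from importlib import metadata


def parse_release(version):
    """Turn '0.4.2' into (0, 4, 2), padding to three components.

    Simplistic on purpose: keeps only the leading digits of each
    dotted piece and stops at the first piece with no leading digit.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_before_api_change(version, first_broken=(0, 5, 0)):
    """True if `version` is older than the release reported as breaking."""
    return parse_release(version) < first_broken


def installed_s3path_ok():
    """Check the installed s3path; None means it is not installed."""
    try:
        installed = metadata.version("s3path")
    except metadata.PackageNotFoundError:
        return None
    return is_before_api_change(installed)


if __name__ == "__main__":
    print("s3path older than 0.5.0:", installed_s3path_ok())
```

Running this inside the virtual environment after the pinned install gives a quick sanity check before starting a mirror run.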