pypa / bandersnatch

A PyPI mirror client according to PEP 381 http://www.python.org/dev/peps/pep-0381/
Academic Free License v3.0

'exclude_platform' plugin causes a KeyError in package metadata verify #505

Open alebourdoulous opened 4 years ago

alebourdoulous commented 4 years ago

Hi,

`bandersnatch verify` fails to verify packages when the "exclude_platform" plugin is enabled. If I remove this plugin, it works fine.

root@49c4e2af68b5:/mirror/data/pypi.org# bandersnatch  verify
2020-05-06 10:09:36,156 INFO: Starting verify for /mirror/data/pypi.org with 10 workers
2020-05-06 10:09:36,162 INFO: Parsing backtesting
2020-05-06 10:09:36,289 INFO: Initialized exclude_platform plugin with ['.win32', '-win32', 'win_amd64', 'win-amd64', 'macosx_', 'macosx-', '.freebsd', '-freebsd']
Traceback (most recent call last):
  File "/usr/local/bin/bandersnatch", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/bandersnatch/main.py", line 177, in main
    return loop.run_until_complete(async_main(args, config))
  File "/usr/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.8/dist-packages/bandersnatch/main.py", line 98, in async_main
    return await bandersnatch.verify.metadata_verify(config, args)
  File "/usr/local/lib/python3.8/dist-packages/bandersnatch/verify.py", line 235, in metadata_verify
    await async_verify(
  File "/usr/local/lib/python3.8/dist-packages/bandersnatch/verify.py", line 210, in async_verify
    await asyncio.gather(*consumers)
  File "/usr/local/lib/python3.8/dist-packages/bandersnatch/verify.py", line 201, in consume
    await verify(
  File "/usr/local/lib/python3.8/dist-packages/bandersnatch/verify.py", line 126, in verify
    plugin.filter(pkg["info"])
  File "/usr/local/lib/python3.8/dist-packages/bandersnatch_filter_plugins/filename_name.py", line 81, in filter
    releases = metadata["releases"]
KeyError: 'releases'
root@49c4e2af68b5:/mirror/data/pypi.org# 
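The last two frames of the traceback show `verify` calling `plugin.filter(pkg["info"])` while the plugin looks up `metadata["releases"]`. In the PyPI JSON layout, `info` and `releases` are sibling top-level keys, so the `info` sub-dictionary has no `releases` entry. The sketch below reproduces that mismatch in isolation. It is not bandersnatch code: `filter_releases`, `PATTERNS`, and the `example-pkg` data are hypothetical stand-ins used only for illustration.

```python
# Made-up stand-in for a PyPI JSON document: "info" and "releases"
# are sibling top-level keys.
pkg = {
    "info": {"name": "example-pkg", "version": "0.1.0"},
    "releases": {
        "0.1.0": [
            {"filename": "example_pkg-0.1.0-cp38-cp38-win_amd64.whl"},
            {"filename": "example-pkg-0.1.0.tar.gz"},
        ]
    },
}

# Two of the substrings listed in the "Initialized exclude_platform" log line.
PATTERNS = ["win_amd64", "macosx_"]


def filter_releases(metadata: dict) -> None:
    """Hypothetical stand-in for a plugin filter: drop matching files in place."""
    releases = metadata["releases"]  # same lookup that fails in the traceback
    for version, files in list(releases.items()):
        releases[version] = [
            f for f in files if not any(p in f["filename"] for p in PATTERNS)
        ]


# Passing only the "info" sub-dictionary reproduces the error.
try:
    filter_releases(pkg["info"])
    error = None
except KeyError as exc:
    error = repr(exc)

# Passing the whole document gives the filter access to "releases".
filter_releases(pkg)
remaining = [f["filename"] for f in pkg["releases"]["0.1.0"]]
```

Whether the right fix is to pass the full document or to make the plugin tolerate a missing key is a decision for the maintainers; the sketch only shows why the lookup fails.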

I also have a question: is it possible to skip downloading all binary wheels (files with the .whl extension)?

Alain

cooperlees commented 4 years ago

Thanks for reporting the bug. Can I grab a copy of your bandersnatch.conf please?

As for skipping all wheels, I think release file regex matching would be your best bet: https://bandersnatch.readthedocs.io/en/latest/filtering_configuration.html#release-file-regex-matching
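To illustrate the idea behind regex-based wheel filtering, here is a minimal sketch of matching release filenames against a pattern that targets the `.whl` extension. It does not use the bandersnatch plugin API or its configuration keys (see the linked documentation for those); `WHEEL_RE`, `drop_wheels`, and the filenames are hypothetical.

```python
import re

# Pattern that matches any filename ending in ".whl".
WHEEL_RE = re.compile(r"\.whl$")


def drop_wheels(filenames):
    """Return only the filenames that are not wheels."""
    return [name for name in filenames if not WHEEL_RE.search(name)]


# Made-up release files for a hypothetical package.
files = [
    "example-1.0-py3-none-any.whl",
    "example-1.0.tar.gz",
    "example-1.0-cp38-cp38-manylinux1_x86_64.whl",
]
kept = drop_wheels(files)
```

With this pattern only the source distribution is kept, which is the effect asked about above.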

alebourdoulous commented 4 years ago

Hi Cooper,

Here is the requested file:

root@bd682f7432c1:/# cat /etc/bandersnatch.conf 
[mirror]
; The directory where the mirror data will be stored.
directory = /mirror/data/pypi.org

json = true

; Cleanup legacy non PEP 503 normalized named simple directories
cleanup = false

; The PyPI server which will be mirrored.
; master = https://testpypi.python.org
; scheme for PyPI server MUST be https
master = https://pypi.org

; The network socket timeout to use for all connections. This is set to a
; somewhat aggressively low value: rather fail quickly temporarily and re-run
; the client soon instead of having a process hang infinitely and have TCP not
; catching up for ages.
timeout = 300

; Number of worker threads to use for parallel downloads.
; Recommendations for worker thread setting:
; - leave the default of 3 to avoid overloading the pypi master
; - official servers located in data centers could run 10 workers
; - anything beyond 10 is probably unreasonable and avoided by bandersnatch
workers = 6

; Whether to hash package indexes
; Note that package index directory hashing is incompatible with pip, and so
; this should only be used in an environment where it is behind an application
; that can translate URIs to filesystem locations.  For example, with the
; following Apache RewriteRule:
;     RewriteRule ^([^/])([^/]*)/$ /mirror/pypi/web/simple/$1/$1$2/
;     RewriteRule ^([^/])([^/]*)/([^/]+)$/ /mirror/pypi/web/simple/$1/$1$2/$3
; Setting this to true would put the package 'abc' index in simple/a/abc.
; Recommended setting: the default of false for full pip/pypi compatability.
hash-index = false

; Whether to stop a sync quickly after an error is found or whether to continue
; syncing but not marking the sync as successful. Value should be "true" or
; "false".
stop-on-error = false

; Whether or not files that have been deleted on the master should be deleted
; on the mirror, too.
; IMPORTANT: if you are running an official mirror than you *need* to leave
; this on.
delete-packages = true

; Advanced logging configuration. Uncomment and set to the location of a
; python logging format logging config file.
; log-config = /etc/bandersnatch-log.conf

[statistics]
; A glob pattern matching all access log files that should be processed to
; generate daily access statistics that will be aggregated on the master PyPI.
access-log-pattern = ${MIRROR_LOG}/*.pypi.python.org*access*

[plugins]
enabled =
    blacklist_project
    exclude_platform
    regex_project

[blacklist]
packages =
    example1
    catboost-dev
    lalsuite
    paddlepaddle-gpu
    OpenVisus
    frida
    mxnet
    tf-nightly
    tf_nightly

platforms =
    windows
    macos
    freebsd

[filter_regex]
packages =
    mxnet+.
    .+mxnet+.
    .+mxnet
    tf-nightly+.
    tf_nightly+.
    cupy-cuda+.
    tensorflow+.
    .+tensorflow+.
    lalsuite+.
    catboost+.
    .+cuda

; vim: set ft=cfg:
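A side note on the `[filter_regex]` patterns above: entries such as `mxnet+.` put the `+` before the `.`, so they read as "one or more `t`, then any single character" rather than "`mxnet` followed by anything". Whether this matters depends on how the patterns are applied, which this thread does not state. The sketch below compares prefix matching (`re.match`) with whole-string matching (`re.fullmatch`) using Python's `re` module; the package names are only examples.

```python
import re

AS_WRITTEN = "mxnet+."   # pattern as it appears in the config
REORDERED = "mxnet.+"    # the same characters with ".+" at the end

name = "mxnet-cu101"

# Prefix matching: both patterns accept the name.
prefix_as_written = re.match(AS_WRITTEN, name) is not None
prefix_reordered = re.match(REORDERED, name) is not None

# Whole-string matching: only the reordered pattern accepts it.
full_as_written = re.fullmatch(AS_WRITTEN, name) is not None
full_reordered = re.fullmatch(REORDERED, name) is not None

# Neither pattern matches the bare name "mxnet" in either mode,
# because both require at least one character after "mxnet".
bare_as_written = re.match(AS_WRITTEN, "mxnet") is not None
bare_reordered = re.match(REORDERED, "mxnet") is not None
```

Under prefix matching the two spellings behave the same for these names, so the config may well work as intended; the difference only appears if patterns must match the whole project name.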

Thanks