pypa / bandersnatch

A PyPI mirror client according to PEP 381 http://www.python.org/dev/peps/pep-0381/
Academic Free License v3.0

Configuration and Filtering Help #1282

Closed: CuckooEXE closed this issue 1 year ago

CuckooEXE commented 1 year ago

Hi all! I'm trying to create a PyPI mirror for offline usage. For what I need, I don't want any of the really large packages (e.g. the AI/ML stuff). However, I'm having a hard time figuring out the correct way to filter those out. So a few questions:

  1. Does the following configuration section (which I found in another issue) explicitly allow or forbid projects that match the regex patterns?
[regex_project_metadata]
none:match-null:info.name =
  ^tf
  ^mxnet
  ^tensorflow
  ^cupy
  \-nightly$
  ^lalsuite
  ^cntk
  ^catboost
  ^openvisus
  ^paddlepaddle
  ^torch
  ^grpcio
  ^codeintel
  ^CodeIntel
  ^opencv
  ^fiona
  ^sickrage
  2. How do I filter against this tag? Topic :: Scientific/Engineering :: Artificial Intelligence
  3. What is the todo file?
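
As a quick sanity check on the patterns from the configuration section above, here is a minimal Python sketch that reports which patterns hit a given project name. It does not call bandersnatch and does not model what the `none:match-null` tags do with a hit; it only shows what the regexes themselves match. The use of `re.search` (rather than `re.match`) and the helper name `matching_patterns` are assumptions made for illustration.

```python
import re

# Patterns copied verbatim from the [regex_project_metadata] section above.
PATTERNS = [
    r"^tf", r"^mxnet", r"^tensorflow", r"^cupy", r"\-nightly$", r"^lalsuite",
    r"^cntk", r"^catboost", r"^openvisus", r"^paddlepaddle", r"^torch",
    r"^grpcio", r"^codeintel", r"^CodeIntel", r"^opencv", r"^fiona", r"^sickrage",
]
COMPILED = [re.compile(p) for p in PATTERNS]


def matching_patterns(name: str) -> list[str]:
    """Return the patterns that hit `name`.

    Assumption: unanchored re.search semantics, so `\\-nightly$` can hit
    a suffix. bandersnatch's own matching may differ.
    """
    return [c.pattern for c in COMPILED if c.search(name)]


if __name__ == "__main__":
    for name in ["tensorflow", "tf-nightly", "torchvision", "requests", "Fiona"]:
        print(name, matching_patterns(name))
```

Under these assumptions, `requests` hits nothing, `tf-nightly` hits both `^tf` and `\-nightly$`, and `Fiona` hits nothing because the patterns are case-sensitive, which is presumably why the list carries both `^codeintel` and `^CodeIntel`.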

Thanks! :)

cooperlees commented 1 year ago

1) The docs say it's an allow list: https://bandersnatch.readthedocs.io/en/latest/filtering_configuration.html#project-regex-matching

2) Since this seems to be an allow list, you would need to match every classifier other than Scientific ... unless you want to try extending the plugin so it can act as a deny list. A PR would be welcome.

3) The todo file means a sync did not complete. The next bandersnatch run will retry downloading those packages before recalculating the differences to pull.
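
For the classifier question, the check a deny list would need can be sketched outside bandersnatch. The snippet below is not the plugin's code; it is a hypothetical predicate (`has_ai_classifier`) over a dict shaped like PyPI's JSON metadata, where classifiers live under `info.classifiers`. How such a predicate would be wired into a filter plugin is left open.

```python
import re

# The classifier named in the question, anchored so only the exact topic
# matches. None of its characters are special in a regex.
AI_CLASSIFIER = re.compile(
    r"^Topic :: Scientific/Engineering :: Artificial Intelligence$"
)


def has_ai_classifier(metadata: dict) -> bool:
    """True if any entry in metadata['info']['classifiers'] is the AI topic.

    A missing or null classifier list counts as "no match" here; that
    choice is an assumption of this sketch.
    """
    classifiers = metadata.get("info", {}).get("classifiers") or []
    return any(AI_CLASSIFIER.search(c) for c in classifiers)


if __name__ == "__main__":
    # Hypothetical metadata, trimmed to the one field the check reads.
    sample = {
        "info": {
            "classifiers": [
                "Programming Language :: Python :: 3",
                "Topic :: Scientific/Engineering :: Artificial Intelligence",
            ]
        }
    }
    print(has_ai_classifier(sample))
```

A deny list would drop projects where this returns true, whereas an allow list has to enumerate everything else, which is the difficulty described in point 2.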

Sorry I can't be of more help. I don't personally run bandersnatch anywhere, but I am the sole maintainer :(

CuckooEXE commented 1 year ago

Ahh, I see now. I didn't catch that part of the documentation. Thank you for explaining!