pypa / bandersnatch

A PyPI mirror client according to PEP 381 http://www.python.org/dev/peps/pep-0381/
Academic Free License v3.0

adding a new package to whitelist section does not work #97

Closed GreatBahram closed 5 years ago

GreatBahram commented 5 years ago

First of all, thanks for the whitelist plugin, @dwighthubbard and other friends. I had a problem with the whitelist plugin. Consider this config file, for example:

[mirror]
directory = /srv/pypi
json = false
master = https://pypi.org
timeout = 10
workers = 8
hash-index = false
stop-on-error = false
verifiers = 3

[whitelist]
packages =
    bs4
    requests
    click

After bandersnatch downloaded all the related files, I wanted to add more packages to the whitelist, but this time bandersnatch could not get the new packages. I think there is a problem with the target_serial attribute. The output I got:

2018-11-28 14:42:13,353 INFO: bandersnatch/3.1.1 (cpython 3.6.7-final0, Linux x86_64)
2018-11-28 14:42:13,354 INFO: Syncing with https://pypi.org.
2018-11-28 14:42:13,354 INFO: Current mirror serial: 4538127
2018-11-28 14:42:13,354 INFO: Syncing based on changelog.
2018-11-28 14:42:14,395 DEBUG: Project blacklist is []
2018-11-28 14:42:14,396 DEBUG: Initialized project plugin 'blacklist_project', filtering []
2018-11-28 14:42:14,399 DEBUG: Project whitelist is ['django', 'bs4', 'click', 'pytest', 'aiohttp', 'requests']
2018-11-28 14:42:14,399 INFO: Initialized project plugin 'whitelist_project', filtering ['django', 'bs4', 'click', 'pytest', 'aiohttp', 'requests']
2018-11-28 14:42:14,401 INFO: Trying to reach serial: 4538140
2018-11-28 14:42:14,401 INFO: 0 packages to sync.
2018-11-28 14:42:14,402 DEBUG: Starting to sync packages 8 at once
2018-11-28 14:42:14,402 ERROR: Problem with package syncs: []
2018-11-28 14:42:14,402 INFO: Generating global index page.
2018-11-28 14:42:14,403 INFO: New mirror serial: 4538140
2018-11-28 14:42:14,404 INFO: 0 packages had changes

Content of pypi directory:

.
├── generation
├── output
├── status
└── web
    ├── last-modified
    ├── local-stats
    │   └── days
    ├── packages
    │   ├── 00
     .................
    │   └── ff
    └── simple
        ├── bs4
        ├── click
        ├── index.html
        └── requests

182 directories, 5 files

What's the problem? Also, is there any way to monitor the current status of bandersnatch, such as how many packages have been downloaded so far?

cooperlees commented 5 years ago

Bandersnatch uses a serial to ask PyPI what has changed since its last successful sync. When you add a new package, you need to remove the status file so that bandersnatch asks for every change since the inception of PyPI serials.

To do this:

# This is using your directory in your config file
mv -v /srv/pypi/status* /tmp

Then restart a bandersnatch sync and it will sync your "newly" added packages.
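To illustrate why the sync above reported "0 packages to sync", here is a toy model of serial-based syncing. It is not bandersnatch's code: the changelog data, the serial numbers, and the `packages_to_sync` helper are all made up for this sketch, and it only assumes that a sync considers changes newer than the stored serial and then applies the whitelist.

```python
# Toy model only, NOT bandersnatch's implementation.
# Each hypothetical changelog entry is (serial, package_name).
CHANGELOG = [
    (100, "requests"),
    (200, "django"),
    (300, "click"),
    (400, "numpy"),
]


def packages_to_sync(changelog, synced_serial, whitelist):
    """Return whitelisted packages that changed after synced_serial."""
    return sorted(
        {
            name
            for serial, name in changelog
            if serial > synced_serial and name in whitelist
        }
    )


# "django" was just added to a whitelist that previously held
# only "requests" and "click".
whitelist = {"requests", "click", "django"}

# The mirror is already at serial 400, and django's last change (200) is
# older than that, so nothing is selected.
print(packages_to_sync(CHANGELOG, 400, whitelist))  # []

# With the status file gone the serial starts from 0, so every
# whitelisted package is considered again.
print(packages_to_sync(CHANGELOG, 0, whitelist))  # ['click', 'django', 'requests']
```

In this model a newly whitelisted package is only picked up if it happens to change upstream after the stored serial, which matches the behaviour reported in this issue.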

We should:

Other ideas?

GreatBahram commented 5 years ago

In my opinion, doing that manually is a bit of a dirty job, isn't it? Adding an option would be a better choice. This can be done in two ways:

  1. Move the status file before entering the main.mirror function.
  2. Set the mirror.synced_serial attribute to zero (this one sounds like a mock-up ...) inside the main.mirror function.

I tested the first one. The only problem I had was that its logging message appears before anything else, which can be solved by setting the logging level to debug.
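A minimal sketch of the first option, written as a standalone helper and not as the code from any PR. The function name `set_aside_status` and the backup directory are assumptions made for illustration. It mirrors the `mv -v /srv/pypi/status* /tmp` command from earlier in the thread and would have to run before the sync starts.

```python
import shutil
import tempfile
from pathlib import Path


def set_aside_status(mirror_dir, backup_dir):
    """Move every status* file out of mirror_dir and return the new paths.

    Hypothetical helper: it moves the files (it does not delete them),
    so the old serial can be restored by moving them back.
    """
    mirror_dir = Path(mirror_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    for path in sorted(mirror_dir.glob("status*")):
        target = backup_dir / path.name
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved


# Demonstration against a throwaway directory, not a real mirror.
with tempfile.TemporaryDirectory() as tmp:
    mirror = Path(tmp) / "pypi"
    mirror.mkdir()
    (mirror / "status").write_text("4538140")

    moved = set_aside_status(mirror, Path(tmp) / "backup")
    print([p.name for p in moved])        # ['status']
    print((mirror / "status").exists())   # False
```

Moving the file keeps the manual workaround reversible, which is the main reason this sketch does not simply unlink it.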

GreatBahram commented 5 years ago

Please check PR #102. Is that OK?

cooperlees commented 5 years ago

A CLI arg to make this easier is being pushed in 3.1.2.