pulp / pulp_python

A Pulp plugin to support Python packages
GNU General Public License v2.0

Packages containing a dot in the name are lost during Pulp to Pulp syncs #716

Closed quba42 closed 1 month ago

quba42 commented 1 month ago

Version

Describe the bug

When synchronizing a Python repo from one Pulp instance to another, any Python packages containing a dot . in the name are silently dropped during sync. The sync itself succeeds.

To Reproduce

  1. Create a remote with URL: https://pypi.org/ and Includes: oslo.utils
  2. Sync, publish and distribute. You can verify that various versions of oslo.utils are present in the repo.
  3. Now sync the distributed repo to a second Pulp instance (do not set the Includes field, just synchronize everything).
  4. Sync succeeds, but results in an empty python repo.

Expected behavior

I have no idea as to the wisdom of having a dot . in the name of a Python project (probably a bad idea!), but since such names exist on PyPI and can be synced and published to Pulp, I would expect them to also sync from one Pulp instance to another.

Additional context

I can provide log output from the relevant Pulp to Pulp sync:

Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: pulpcore.tasking.tasks:INFO: Starting task 019121c7-1a12-7c98-8034-c3d4fdc85c86
Aug 05 09:03:02 test-proxy-release pulpcore-api[41162]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]:  - - [05/Aug/2024:09:03:02 +0000] "GET /pulp/api/v3/tasks/019121c7-1a12-7c98-8034-c3d4fdc85c86/ HTTP/1.1" 200 770 "-" "OpenAPI-Generator/3.39.11/ruby"
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: bandersnatch:INFO: Initialized release plugin blocklist_release, filtering []
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: bandersnatch:INFO: Initialized prerelease plugin with [re.compile('.+rc\\d+$'), re.compile('.+a(lpha)?\\d+$'), re.compile('.+b(eta)?\\d+$'), re.compile('.+dev\\d+$')]
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: bandersnatch:INFO: Initialized prerelease plugin prerelease_release, filtering all packages
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: bandersnatch.mirror:INFO: Syncing with https://test-deploy-release.infra.dev.atix/pulp/content/ATIX/PROXY/CV_python_test/custom/test_porduct_python/test_dot_in_name_python_module.
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: pulp_python.app.tasks.sync:INFO: Attempt 0 to get package list from https://test-deploy-release.infra.dev.atix/pulp/content/ATIX/PROXY/CV_python_test/custom/test_porduct_python/test_dot_in_name_python_module
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: pulp_python.app.tasks.sync:INFO: Syncing all packages.
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: aiohttp.client:WARNING: Could not read .netrc file: [Errno 2] No such file or directory: '.fake-netrc'
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: aiohttp.client:WARNING: Could not read .netrc file: [Errno 2] No such file or directory: '.fake-netrc'
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: pulp_python.app.tasks.sync:INFO: Attempt 1 to get package list from https://test-deploy-release.infra.dev.atix/pulp/content/ATIX/PROXY/CV_python_test/custom/test_porduct_python/test_dot_in_name_python_module
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: pulp_python.app.tasks.sync:INFO: Syncing all packages.
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: aiohttp.client:WARNING: Could not read .netrc file: [Errno 2] No such file or directory: '.fake-netrc'
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: pulp_python.app.tasks.sync:INFO: Attempt 2 to get package list from https://test-deploy-release.infra.dev.atix/pulp/content/ATIX/PROXY/CV_python_test/custom/test_porduct_python/test_dot_in_name_python_module
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: pulp_python.app.tasks.sync:INFO: Syncing all packages.
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: aiohttp.client:WARNING: Could not read .netrc file: [Errno 2] No such file or directory: '.fake-netrc'
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: aiohttp.client:WARNING: Could not read .netrc file: [Errno 2] No such file or directory: '.fake-netrc'
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: pulp_python.app.tasks.sync:INFO: Failed to get package list using XMLRPC, trying parse simple page.
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: bandersnatch.mirror:INFO: No project filters are enabled. Skipping filtering
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: pulp_python.app.tasks.sync:INFO: 1 packages to sync.
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: bandersnatch.mirror:INFO: No metadata filters are enabled. Skipping metadata filtering
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: bandersnatch.mirror:INFO: No release file filters are enabled. Skipping release file filtering
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: bandersnatch.package:INFO: Fetching metadata for package: oslo-utils (serial 0)
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: aiohttp.client:WARNING: Could not read .netrc file: [Errno 2] No such file or directory: '.fake-netrc'
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: aiohttp.client:WARNING: Could not read .netrc file: [Errno 2] No such file or directory: '.fake-netrc'
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: bandersnatch.package:INFO: oslo-utils no longer exists on PyPI
Aug 05 09:03:02 test-proxy-release pulpcore-worker-3[44179]: pulp [2e4353d6-8ea1-4f07-a31b-3cd8817c14a4]: pulpcore.tasking.tasks:INFO: Task completed 019121c7-1a12-7c98-8034-c3d4fdc85c86

The line "bandersnatch.package:INFO: oslo-utils no longer exists on PyPI" looks suspicious here. It suggests that something in the sync path converted oslo.utils to oslo-utils, and that the lookup for oslo-utils on the upstream Pulp instance then found nothing, resulting in this issue. Just a hypothesis.
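For reference, the oslo.utils to oslo-utils conversion matches the project name normalization described in PEP 503, which collapses runs of ".", "-" and "_" into a single "-" and lowercases the result. A minimal sketch of that rule follows; the function name normalize is illustrative only and is not taken from pulp_python or bandersnatch:

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503.

    Runs of '-', '_' and '.' are collapsed into a single '-',
    and the result is lowercased.
    """
    return re.sub(r"[-_.]+", "-", name).lower()


# The name from this report and a few equivalent spellings
# all normalize to the same string.
for raw in ("oslo.utils", "oslo_utils", "Oslo-Utils"):
    print(raw, "->", normalize(raw))
```

If the downstream side requests the normalized name while the upstream side only matches the stored name exactly, the lookup would miss, which would be consistent with the log above.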

quba42 commented 1 month ago

Found an old issue relating to a dot . in the name (not scoped to Pulp to Pulp syncs) that was fixed years ago: https://github.com/pulp/pulp_python/issues/467. Not sure if this is relevant.

gerrod3 commented 1 month ago

Good find. This is the same type of issue as #467: we need to use a normalized name filter for the PyPI JSON API, as we already do for the HTML simple API. Should be a simple fix.
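To illustrate the difference between an exact name filter and a normalized one, here is a small standalone sketch. It is not the actual pulp_python code or fix; filter_by_project and the in-memory list of stored names are hypothetical stand-ins for whatever query the JSON API endpoint performs:

```python
import re
from typing import Iterable, List


def normalize(name: str) -> str:
    """PEP 503 style normalization of a project name."""
    return re.sub(r"[-_.]+", "-", name).lower()


def filter_by_project(stored_names: Iterable[str], requested: str) -> List[str]:
    """Hypothetical filter: match names by their normalized form."""
    wanted = normalize(requested)
    return [n for n in stored_names if normalize(n) == wanted]


# Names as stored in the repository (assumed example data).
stored = ["oslo.utils", "requests"]

# Exact comparison against the normalized request misses the package.
exact = [n for n in stored if n == "oslo-utils"]

# Comparing normalized forms on both sides finds it.
normalized = filter_by_project(stored, "oslo-utils")

print("exact match:     ", exact)
print("normalized match:", normalized)
```

Normalizing both the stored name and the requested name means the lookup succeeds no matter which spelling the client uses.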