Open grzleadams opened 4 months ago
I moved the issue to the pulp_python repository since the issue is with this plugin.
I would try removing the simple/ part from the remote url and retrying the sync. I'm not entirely sure how JFrog sets up their pypi repository, but assuming that https://<host>/artifactory/pypi/pypi-local/ is the base index page for your repository, this should be the url you use for your remote.
It is still possible we might not fully support authenticated syncs, at least through normal use of the remote's username and password fields. If you are still getting 401s, try using the https://<username>:<password>@<host>/.../ url form.
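For anyone trying the credentials-in-URL form, here is a minimal sketch of building such a URL with the Python standard library. The helper name `with_credentials`, the host `repo.example.com`, and the sample username and password are hypothetical. The sketch percent-encodes the credentials so that characters like `@` or `:` in a password do not break URL parsing; whether the client consuming the URL decodes them again is an assumption worth verifying against your setup.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_credentials(url: str, username: str, password: str) -> str:
    """Return url with percent-encoded credentials placed before the host.

    Hypothetical helper for illustration; IPv6 literal hosts are not handled.
    """
    parts = urlsplit(url)
    # Encode every reserved character so '@', ':' or '/' inside the
    # credentials cannot be mistaken for URL delimiters.
    userinfo = f"{quote(username, safe='')}:{quote(password, safe='')}"
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit(
        (parts.scheme, f"{userinfo}@{host}", parts.path, parts.query, parts.fragment)
    )


# Example values below are made up.
print(
    with_credentials(
        "https://repo.example.com/artifactory/pypi/pypi-local/",
        "sync-user",
        "p@ss:word/1",
    )
)
```

The resulting string is what would go into the remote's url field in place of the `https://<username>:<password>@<host>/.../` placeholder.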
Unfortunately, I did try removing simple/ (and a bunch of other permutations of the URL) and nothing worked. I did try to set the URL to include the credentials but Pulp wouldn't let me (I get the "url contains username or password" error I mentioned before). Is there a way to work around that/set it directly on the remote without using the CLI/API (I assume the validation happens either way)?
If you want to directly set the url on the remote without validation you can do it through the shell. On the pulp instance run pulpcore-manager shell_plus; this should bring up a python shell with some classes already imported. Try:
py_remote = Remote.objects.get(name="your_python_remote_name")
py_remote.url = "https://<username>:<password>@<host>/artifactory/pypi/pypi-local/"
py_remote.save()
This should bypass the validation done through the API.
Is shell_plus available in 3.49.1/the minimal image?
Unknown command: 'shell_plus'. Did you mean shell?
It might not be. Instead use its suggestion, pulpcore-manager shell, and then add this line to the top: from pulpcore.app.models import Remote
Setting the credentials in the URL seems to have worked, so we're not getting the unauthenticated user business anymore.
2024-05-16T18:01:01.122Z|<thread_id>|<ipaddress>|<authenticated_user>|GET|/api/pypi/pypi-local/simple/pypi/<module_name>/json|404|-1|0|3|bandersnatch/6.1.0 (cpython 3.9.18-final0, Linux x86_64) (aiohttp 3.9.3)
The 404 appears to be related to both simple/pypi and /json; fixing both gives an HTML response that lists all available module versions. Are those two things required by the PyPI API spec?
Both /simple/ and /pypi/<package_name>/json are PyPI APIs. /simple/ is used by pip for package installs and /pypi/* is used by bandersnatch (the tool Pulp uses under the hood) for syncing. When specifying the url for syncing you should only use the base url of your index, with no /simple/ or /pypi/*, as bandersnatch will add the /pypi ending itself.
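To make the URL layout concrete, here is a small sketch of how the two endpoints relate to the base url. It only mirrors the behaviour described above and is not bandersnatch's actual code; the function name `index_endpoints`, the host `host.example`, and the package name `requests` are illustrative assumptions.

```python
def index_endpoints(base_url: str, package_name: str) -> dict[str, str]:
    """Derive the PyPI-style endpoints from an index base url.

    Hypothetical helper: it illustrates the path layout only.
    """
    base = base_url.rstrip("/")
    return {
        # Used by pip for installs.
        "simple_index": f"{base}/simple/",
        "simple_project": f"{base}/simple/{package_name}/",
        # Requested per package during a sync.
        "json_metadata": f"{base}/pypi/{package_name}/json",
    }


# Correct: the remote url is the index base.
good = index_endpoints("https://host.example/artifactory/api/pypi/pypi-local", "requests")
print(good["json_metadata"])

# Incorrect: a remote url that already ends in simple/ yields the
# .../simple/pypi/<name>/json path that returned 404 in the JFrog log.
bad = index_endpoints("https://host.example/artifactory/api/pypi/pypi-local/simple/", "requests")
print(bad["json_metadata"])
```

The second call shows why a remote url ending in simple/ produces the odd-looking .../simple/pypi/<module_name>/json request seen earlier in the thread.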
Do you know if bandersnatch will follow redirects? Apparently JFrog is doing something with their reverse proxy that requires it (for example, to just curl the simple index you need -L).
It should follow redirects, and the same goes for pip.
I looked through the worker logs and it looks like Pulp finds the package list (there are in fact 26 packages to sync) but hits .netrc errors when trying to pull them:
pulp []: pulpcore.tasking.tasks:INFO: Starting task <task_id>
pulp []: bandersnatch:INFO: Initialized release plugin blocklist_release, filtering []
pulp []: bandersnatch.mirror:INFO: Syncing with https://<url>/artifactory/api/pypi/pypi-local.
pulp []: pulp_python.app.tasks.sync:INFO: Attempt 0 to get package list from https://<url>/artifactory/api/pypi/pypi-local
pulp []: pulp_python.app.tasks.sync:INFO: Syncing all packages.
pulp []: aiohttp.client:WARNING: Could not read .netrc file: [Errno 2] No such file or directory: '.fake-netrc'
pulp []: pulp_python.app.tasks.sync:INFO: Attempt 1 to get package list from https://<url>/artifactory/api/pypi/pypi-local
pulp []: pulp_python.app.tasks.sync:INFO: Syncing all packages.
pulp []: aiohttp.client:WARNING: Could not read .netrc file: [Errno 2] No such file or directory: '.fake-netrc'
pulp []: pulp_python.app.tasks.sync:INFO: Attempt 2 to get package list from https://<url>/artifactory/api/pypi/pypi-local
pulp []: pulp_python.app.tasks.sync:INFO: Syncing all packages.
pulp []: aiohttp.client:WARNING: Could not read .netrc file: [Errno 2] No such file or directory: '.fake-netrc'
pulp []: pulp_python.app.tasks.sync:INFO: Failed to get package list using XMLRPC, trying parse simple page.
pulp []: bandersnatch.mirror:INFO: No project filters are enabled. Skipping filtering
pulp []: pulp_python.app.tasks.sync:INFO: 26 packages to sync.
pulp []: bandersnatch.mirror:INFO: No metadata filters are enabled. Skipping metadata filtering
pulp []: bandersnatch.mirror:INFO: No release file filters are enabled. Skipping release file filtering
pulp []: bandersnatch.package:INFO: Fetching metadata for package: <module> (serial 0)
pulp []: aiohttp.client:WARNING: Could not read .netrc file: [Errno 2] No such file or directory: '.fake-netrc'
pulp []: bandersnatch.package:INFO: <module> no longer exists on PyPI
<snip>
pulp []: pulpcore.tasking.tasks:INFO: Task completed <task_id>
Those "Could not read .netrc file:" warnings are harmless; they were fixed in pulp_python 3.11.1, and they shouldn't affect the sync.
Can you check the output of the sync task? The logs say it completed, so it should give info on how many packages it synced: pulp task show --href <task_href> or --uuid <task_id>.
If the number of synced packages is zero, can you try to curl https://<url>/artifactory/api/pypi/pypi-local/pypi/<package_name>/json and see if it responds with a JSON document of that package's metadata? This should be the endpoint that the sync requests for each package it is syncing.
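The same check can be scripted. Below is a rough sketch using only the Python standard library; the function names are hypothetical, and it assumes the server accepts HTTP Basic auth and returns a PyPI-style JSON document with a `releases` mapping, which may not hold for every Artifactory setup.

```python
import base64
import json
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def count_release_files(metadata: dict) -> int:
    """Count files listed under 'releases' in a PyPI-style JSON document."""
    return sum(len(files) for files in metadata.get("releases", {}).values())


def fetch_metadata(base_url: str, package_name: str, username: str, password: str) -> dict:
    """Fetch <base_url>/pypi/<package_name>/json and parse the response body.

    Raises urllib.error.HTTPError on responses such as 401 or 404.
    """
    url = f"{base_url.rstrip('/')}/pypi/{package_name}/json"
    request = urllib.request.Request(
        url, headers={"Authorization": basic_auth_header(username, password)}
    )
    # urllib follows redirects by default, comparable to curl -L.
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `fetch_metadata(...)` with your base url and credentials and passing the result to `count_release_files` gives a quick signal: an `HTTPError` or a count of zero suggests the sync has nothing to download for that package.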
@grzleadams Did you ever get the sync to work?
No, we were in a bit of a time crunch so I just downloaded all the files and added them to Pulp manually.
I see. Well, when you have time, I am willing to continue helping to resolve this issue; otherwise we can close it if it is no longer needed.
Version Deployed via Operator:
Describe the bug I set up a Pulp python remote pointing at a local JFrog pypi repository ("url": "https://<redacted>/artifactory/api/pypi/pypi-local/simple"), providing valid credentials in the process (with username and password), and linked it with a Pulp python repository and distribution. However, it appears that the credentials are not being passed during the requests when syncing, or the URL is being malformed, or something. From JFrog logs (note the non_authenticated_user and 401):
For what it's worth, that URL also looks strange... I would expect .../simple/<redacted>/json, not .../simple/pypi/<redacted>/json. It's worth noting that Artifactory requires the username/password to be included in the URL but Pulp prevents that:
To Reproduce Steps to reproduce the behavior:
Expected behavior The sync should happen successfully.
Additional context N/A