pypa / pip-audit

Audits Python environments, requirements files and dependency trees for known security vulnerabilities, and can automatically fix them
https://pypi.org/project/pip-audit/
Apache License 2.0

Problems authenticating to a private index #742

Closed (fgsalomon closed 2 months ago)

fgsalomon commented 4 months ago

Expected behavior

Hi,

I have a project with a dependency on a package hosted in a private index. The private index is a Google Artifact Registry. This project uses a requirements.txt file to handle the dependencies.

I'm authenticating through the keyring with the Google Artifact Registry backend. I'm authenticated and have the right permissions in Google Cloud.

I can install my private package without issue by providing the extra index url:

pip install --extra-index-url MY_INDEX_URL -r requirements/requirements.txt 

However, when I run pip-audit with --extra-index-url it can't find the package:

pip-audit -vvvv --extra-index-url MY_INDEX_URL -r requirements/requirements.txt

I expected pip-audit to be able to analyze the dependencies (at least the public ones)

Actual behavior

pip-audit returns an error because it could not find the private package

Reproduction steps

  1. Have a requirements.txt file with a package hosted in a Google Artifact Registry with a correct setup of the keyring
  2. Run pip-audit -vvvv --extra-index-url MY_INDEX_URL -r requirements/requirements.txt

Logs

DEBUG:pip_audit._cli:parsed arguments: Namespace(local=False, requirements=[<_io.TextIOWrapper name='requirements/requirements.txt' mode='r' encoding='UTF-8'>], project_path=None, format=<OutputFormatChoice.Columns: 'columns'>, vulnerability_service=<VulnerabilityServiceChoice.Pypi: 'pypi'>, dry_run=False, strict=False, desc=<VulnerabilityDescriptionChoice.Auto: 'auto'>, aliases=<VulnerabilityAliasChoice.Auto: 'auto'>, cache_dir=None, progress_spinner=<ProgressSpinnerChoice.On: 'on'>, timeout=15, paths=[], verbose=4, fix=False, require_hashes=False, index_url=None, extra_index_urls=['MY_INDEX_URL'], skip_editable=False, no_deps=False, output=PosixPath('stdout'), ignore_vulns=[], disable_pip=False)
ERROR:pip_audit._virtual_env:internal pip failure:  [...]
ERROR: Could not find a version that satisfies the requirement MY_PRIVATE_PACKAGE==X.Y.Z (from versions: none)
ERROR: No matching distribution found for MY_PRIVATE_PACKAGE==X.Y.Z

ERROR:pip_audit._cli:Failed to install packages: ['/var/folders/nl/jq_nzg654wn573pkhr9949xh0000gn/T/tmpful3a4s9/bin/python3.11', '-m', 'pip', 'install', '--no-input', '--extra-index-url', 'MY_INDEX_URL', '--dry-run', '--report', '/var/folders/nl/jq_nzg654wn573pkhr9949xh0000gn/T/tmpn0nqqkdw/tmpcz4kjwr9', '-r', 'requirements/requirements.txt']

Additional context

No response

OS name, version, and architecture

Mac OS 14.2.1 Apple Silicon & Ubuntu 22.04 x86_64

pip-audit version

2.7.1

pip version

24.0

Python version

3.11

woodruffw commented 4 months ago

Thanks for the report @fgsalomon! And thanks for filling out the full template, I greatly appreciate it.

I'm looking into this now -- pip-audit should transparently delegate things like keyring authentication to pip, so this is potentially an error in how we do that.

woodruffw commented 4 months ago

Current operating theory: we added --no-input as a fix for #706, which appears to have disabled keyring lookups by default as of pip 23.1 (https://github.com/pypa/pip/pull/11698).

I think what we need to do here is pass --keyring-provider=subprocess or similar, with the presumption that the user has keyring installed somewhere on their $PATH. We can't use --keyring-provider=import by default most likely, since our default requirements behavior is to create a clean virtual environment (which won't have the keyring package in it).
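
As a rough illustration of the idea above (not pip-audit's actual code), the internal pip invocation could keep `--no-input` while delegating credential lookups to a `keyring` executable. The helper name `build_pip_args` and its parameters are hypothetical, and the `--report` argument seen in the logs is left out for brevity; `--keyring-provider` is the pip option discussed in this comment.

```python
def build_pip_args(python, requirements, extra_index_urls=(), keyring_provider="subprocess"):
    """Return an argv list for a non-interactive `pip install --dry-run`.

    Hypothetical helper: it only assembles arguments and never runs pip.
    """
    args = [
        python, "-m", "pip", "install",
        "--no-input",                            # never prompt for credentials
        "--keyring-provider", keyring_provider,  # look credentials up via the keyring CLI
    ]
    for url in extra_index_urls:
        args += ["--extra-index-url", url]
    args += ["--dry-run", "-r", requirements]
    return args


print(build_pip_args("python3", "requirements/requirements.txt", ["MY_INDEX_URL"]))
```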

@fgsalomon can you confirm that keyring is on your $PATH? If so, I think that's the most viable path forwards here, and I'll make a PR for you to test against 🙂

woodruffw commented 4 months ago

Additional pip context: https://github.com/pypa/pip/issues/8719

fgsalomon commented 4 months ago

Yes, keyring is on my $PATH.

woodruffw commented 4 months ago

Thanks @fgsalomon! I've opened #743 as a prospective fix -- could you give the changes on that branch a try and see if they resolve the issue for you? If so, I'll get them merged and do a patch release ASAP.

fgsalomon commented 4 months ago

It didn't work; I got the same error. FYI, I'm using a virtual env and keyring is installed in it (my $PATH includes the virtual env). Could this be a problem?

I've put a print to see the pip command output:

[...]
Looking in indexes: https://pypi.org/simple, MY_INDEX_URL/
2 location(s) to search for versions of MY_PRIVATE_PACKAGE:
* https://pypi.org/simple/MY_PRIVATE_PACKAGE/
* MY_INDEX_URL/MY_PRIVATE_PACKAGE/
Fetching project page and analyzing links: https://pypi.org/simple/MY_PRIVATE_PACKAGE/
Getting page https://pypi.org/simple/MY_PRIVATE_PACKAGE/
Found index url https://pypi.org/simple/
Looking up "https://pypi.org/simple/MY_PRIVATE_PACKAGE/" in the cache
Request header has "max_age" as 0, cache bypassed
No cache entry available
Starting new HTTPS connection (1): pypi.org:443
https://pypi.org:443 "GET /simple/MY_PRIVATE_PACKAGE/ HTTP/1.1" 404 13
Status code 404 not in (200, 203, 300, 301, 308)
Could not fetch URL https://pypi.org/simple/MY_PRIVATE_PACKAGE/: 404 Client Error: Not Found for url: https://pypi.org/simple/MY_PRIVATE_PACKAGE/ - skipping
Fetching project page and analyzing links: MY_INDEX_URL/MY_PRIVATE_PACKAGE/
Getting page MY_INDEX_URL/MY_PRIVATE_PACKAGE/
Found index url MY_INDEX_URL/
Looking up "MY_INDEX_URL/MY_PRIVATE_PACKAGE/" in the cache
Request header has "max_age" as 0, cache bypassed
No cache entry available
Starting new HTTPS connection (1): MY_INDEX_URL:443
MY_INDEX_URL:443 "GET /simple/MY_PRIVATE_PACKAGE/ HTTP/1.1" 401 60
Found index url MY_INDEX_URL/
Keyring provider requested: subprocess
Keyring provider set: subprocess with executable /Users/fgsalomon/myproject/venv/bin/MY_INDEX_URL
Status code 401 not in (200, 203, 300, 301, 308)
Could not fetch URL MY_INDEX_URL/MY_PRIVATE_PACKAGE/: 401 Client Error: Unauthorized for url: MY_INDEX_URL/MY_PRIVATE_PACKAGE/ - skipping
Skipping link: not a file: https://pypi.org/simple/MY_PRIVATE_PACKAGE/
Skipping link: not a file: MY_INDEX_URL/MY_PRIVATE_PACKAGE/
Given no hashes to check 0 links for project 'MY_PRIVATE_PACKAGE': discarding no candidates

woodruffw commented 4 months ago

Thanks for checking!

Hmm, that's pretty weird -- it looks like pip queries your keyring as expected (or at least doesn't fail outright on it), but still fails to auth to your index.

FYI, I'm using a virtual env and keyring is installed in it (my $PATH includes the virtual env). Could this be a problem?

I think that should be fine -- FWICT from the pip docs, any keyring anywhere on the $PATH should work as expected, so long as it executes correctly.
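
A quick way to see which `keyring` executable a subprocess lookup would pick up is to resolve it against the current $PATH. This is only a sketch using the standard library; the helper name `find_keyring_cli` is made up for illustration, and it does not check that the executable actually runs correctly.

```python
import shutil


def find_keyring_cli(path=None):
    """Return the full path of the first `keyring` executable found, or None.

    With path=None the current PATH environment variable is searched.
    """
    return shutil.which("keyring", path=path)


location = find_keyring_cli()
print(location if location else "no keyring executable found on PATH")
```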

fgsalomon commented 4 months ago

I think the problem is that when calling keyring as a subprocess it requires that a username is passed through the URL.

Looking at pip's code it seems that if the username is not present in the URL it will not request the credentials from the keyring. But Google Artifact Registry auth mechanism doesn't have a username as far as I know and it would be a mess to set up differently in the dev and CI/CD environments...
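
The behavior described above can be modeled in a few lines. This is a simplified sketch of the observation in this comment, not pip's implementation: the function names are hypothetical, and the only rule encoded is "no username in the URL means the keyring CLI is not asked".

```python
from urllib.parse import urlsplit


def username_in_url(index_url):
    """Return the username embedded in the URL, or None if there is none."""
    return urlsplit(index_url).username


def would_query_keyring_cli(index_url):
    """Simplified model: the subprocess provider is consulted only when a username is known."""
    return username_in_url(index_url) is not None


# Hypothetical index URLs, used only for illustration.
print(would_query_keyring_cli("https://index.example.invalid/simple/"))
print(would_query_keyring_cli("https://someuser@index.example.invalid/simple/"))
```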

woodruffw commented 4 months ago

But Google Artifact Registry auth mechanism doesn't have a username as far as I know and it would be a mess to set up differently in the dev and CI/CD environments...

Hmm, could you try _json_key_base64 (that literal string) as the username? https://cloud.google.com/artifact-registry/docs/python/authentication seems to suggest that Google's Artifact Registry will accept that.

And if that doesn't work, can you try these URLs?

https://@MY_INDEX_URL
https://:@MY_INDEX_URL

I don't expect those to work, but those are semi-standard ways to pass an empty-but-explicit username in a URL.
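
For reference, this is how Python's standard URL parser reads those two forms. The host name is a placeholder and the helper `userinfo` is hypothetical; whether pip treats an empty username the same way as a missing one is not established here.

```python
from urllib.parse import urlsplit


def userinfo(url):
    """Return the (username, password) pair that urllib.parse extracts from a URL."""
    parts = urlsplit(url)
    return parts.username, parts.password


for url in (
    "https://index.example.invalid/simple/",    # no userinfo at all
    "https://@index.example.invalid/simple/",   # empty username, no password
    "https://:@index.example.invalid/simple/",  # empty username, empty password
):
    print(url, userinfo(url))
```

With this parser, both forms yield an empty-string username rather than a missing one, which is what "empty-but-explicit" means in practice.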


Taking a step back: if we can't get this to work via the subprocess provider, then I suppose we'll have to somehow use the import keyring provider. Two ideas:

  1. We can define a keyring extra, so that people can do pip install pip-audit[keyring] and get keyring in the same environment.
  2. We can just try to opportunistically import keyring (and hope the user has pre-installed it), and fall back to subprocess if not.

fgsalomon commented 4 months ago

Hmm, could you try _json_key_base64 (that literal string) as the username? cloud.google.com/artifact-registry/docs/python/authentication seems to suggest that Google's Artifact Registry will accept that.

And if that doesn't work, can you try these URLs?

https://@MY_INDEX_URL
https://:@MY_INDEX_URL

None of these worked 😞

We can define a keyring extra, so that people can do pip install pip-audit[keyring] and get keyring in the same environment.

Using Google Artifact Registry also requires installing keyrings.google-artifactregistry-auth. I guess other providers may need other backends. I don't think it would be plausible to add those backends as extras too, but for the normal scenario it would be a great solution.

We can just try to opportunistically import keyring (and hope the user has pre-installed it), and fall back to subprocess if not.

I thought pip-audit was creating a new virtual env, would this import work?

woodruffw commented 4 months ago

None of these worked 😞

Dang. Just to confirm: _json_key_base64 didn't work in either the URL or the .pypirc, right?

I don't think it would be plausible to add those backends as extra too, but for the normal scenario it would be a great solution.

Yeah, unfortunately I think that makes this non-workable for us.

I thought pip-audit was creating a new virtual env, would this import work?

Nope, probably not now that I think about it 😞.
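
The reason, as noted earlier in the thread, is that pip runs inside a clean virtual environment, so what matters is whether that environment's interpreter can import keyring, not the interpreter running pip-audit. A minimal sketch of such a check follows; `can_import` is a hypothetical helper, and the current interpreter stands in for the virtual environment's Python.

```python
import subprocess
import sys


def can_import(python, module):
    """Return True if `module` is importable by the interpreter at `python`."""
    result = subprocess.run(
        [python, "-c", f"import {module}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


# sys.executable is used here only as an example interpreter path.
print("keyring importable:", can_import(sys.executable, "keyring"))
```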


Okay, unfortunately I think I'm out of ideas here. To fix this on our end, we probably need one of two things:

  1. pip --keyring-provider=subprocess needs to allow us to pass an empty username in somehow or otherwise replicate what happens by default.
  2. Google Artifact Registry needs to provide some kind of default username that we can stuff in here.

In the meantime, I think #743 will fix some use cases, but unfortunately not this one. So I'm going to merge it, and also write up some docs that link to this issue until we have a real fix here.

fgsalomon commented 4 months ago

Dang. Just to confirm: _json_key_base64 didn't work in either the URL or the .pypirc, right?

It didn't work (I'm not using a service account key, BTW).

To me the first option makes more sense since there are backends that don't need the username.

In the meantime, I think https://github.com/pypa/pip-audit/pull/743 will fix some use cases, but unfortunately not this one. So I'm going to merge it, and also write up some docs that link to this issue until we have a real fix here.

Great. Thanks for your help @woodruffw !

fgsalomon commented 4 months ago

I found this issue that addresses the username problems, but it didn't work for me 🤔. There is also this other issue.

So, as you said, #743 is the only thing required in pip-audit.

Thanks again!

woodruffw commented 4 months ago

No problem, happy to help! And thank you again for your detailed report and triage efforts!

woodruffw commented 4 months ago

JFYI: I've updated #743 to also include a troubleshooting section that links to this issue. We'll keep the issue open as well 🙂

fgsalomon commented 2 months ago

FYI: I had the chance to test this again, and using oauth2accesstoken as the username for the Google Artifact Registry worked, as pointed out here. I think this issue can be closed now. Sorry for not following up earlier.
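
For anyone applying the same workaround, the username can be added to an index URL programmatically. This is a sketch only: `with_username` is a hypothetical helper, the host below is a placeholder rather than a real Artifact Registry URL, and the password is still expected to come from the keyring.

```python
from urllib.parse import urlsplit, urlunsplit


def with_username(index_url, username):
    """Return `index_url` with `username` set, replacing any existing userinfo."""
    parts = urlsplit(index_url)
    host = parts.netloc.rpartition("@")[2]  # drop existing userinfo, keep host[:port]
    return urlunsplit(parts._replace(netloc=f"{username}@{host}"))


print(with_username("https://index.example.invalid/simple/", "oauth2accesstoken"))
```

The resulting URL would then be passed wherever MY_INDEX_URL was used before, for example as the value of --extra-index-url.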

woodruffw commented 2 months ago

No problem, happy to hear it's working!

I'll update the troubleshooting with that info as well.