pypa / pip-audit

Audits Python environments, requirements files and dependency trees for known security vulnerabilities, and can automatically fix them
https://pypi.org/project/pip-audit/
Apache License 2.0

Failure to find dependency that was installed from extra index url #599

Open cornelius-braun opened 1 year ago

cornelius-braun commented 1 year ago

Bug description

I created a requirements file for my project using pip-compile. To get the correct version, I added an extra index URL for the torch installation, resulting in the following command:

pip-compile --extra-index-url https://download.pytorch.org/whl/cpu

This gives me the following requirements.txt:

filelock==3.12.0
jinja2==3.1.2
markupsafe==2.1.2
mpmath==1.3.0
networkx==3.1
sympy==1.11.1
torch==2.0.0
typing-extensions==4.5.0

When I run pip-audit on this, torch is skipped during the audit:

No known vulnerabilities found
Name  Skip Reason
----- ------------------------------------------------------------------------
torch Dependency not found on PyPI and could not be audited: torch (2.0.0+cpu)

Is this a bug or am I misusing pip-audit?

Reproduction steps

I generated my requirements using

pip-compile --extra-index-url https://download.pytorch.org/whl/cpu
pip-sync

Then I ran

pip-audit -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cpu

Expected behavior

Auditing of all packages including torch.

Screenshots and logs

The logs are as follows:

DEBUG:pip_audit._cli:Auditing filelock (3.12.0)
DEBUG:pip_audit._cli:Auditing Jinja2 (3.1.2)
DEBUG:pip_audit._cli:Auditing MarkupSafe (2.1.2)
DEBUG:pip_audit._cli:Auditing mpmath (1.3.0)
DEBUG:pip_audit._cli:Auditing networkx (3.1)
DEBUG:pip_audit._cli:Auditing sympy (1.11.1)
DEBUG:pip_audit._service.pypi:Dependency not found on PyPI and could not be audited: torch (2.0.0+cpu)
DEBUG:pip_audit._cli:Auditing typing_extensions (4.5.0)
No known vulnerabilities found
Name  Skip Reason
----- ------------------------------------------------------------------------
torch Dependency not found on PyPI and could not be audited: torch (2.0.0+cpu)

Platform information

woodruffw commented 1 year ago

Thanks for the report @cornelius-braun, and for filling out each section! We greatly appreciate it.

Is this a bug or am I misusing pip-audit?

This is a somewhat tricky case:

  1. pip-audit fundamentally relies on PyPI for vulnerability information, which means that it can only supply vulnerability reports for packages that appear on PyPI. 2.0.0+cpu is a distinct version from 2.0.0 and the former is only on your extra index, so pip-audit is arguably correct in reporting that it couldn't find an auditable dependency with that name and version on PyPI.
  2. At the same time, 2.0.0+cpu is really just 2.0.0 with a PEP 440 "local version identifier" of cpu. These are applied most often by Linux and similar distributions, e.g. foopkg-1.0.0+ubuntu.0. Local versions are sometimes considered equivalent to their non-local counterparts, but not always: they sometimes carry extra patches, or imply different build processes, dependencies, etc. I don't know a ton about Torch, but I suspect that the cpu local tag implies that something is different about the build here; given that, I'm not sure if it would be sound of us to return non-local audit results for it.
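The two points above can be illustrated with a minimal sketch. It is not pip-audit's code: `split_local` and `PYPI_VERSIONS` are hypothetical names, and the set of versions is made-up stand-in data rather than a real query against PyPI. Real code would more likely use the `packaging` library, whose `Version` objects expose `local` and `base_version`; plain string handling is used here only to keep the example dependency-free.

```python
from typing import Optional, Tuple

# Hypothetical stand-in for the torch versions published on PyPI.
PYPI_VERSIONS = {"1.13.1", "2.0.0"}


def split_local(version: str) -> Tuple[str, Optional[str]]:
    """Split 'X.Y.Z+label' into ('X.Y.Z', 'label'); label is None if absent."""
    public, sep, local = version.partition("+")
    return public, (local if sep else None)


installed = "2.0.0+cpu"
public, local = split_local(installed)

print(installed in PYPI_VERSIONS)  # False: the exact version is not listed
print(public in PYPI_VERSIONS)     # True: the public part is listed
print(local)                       # cpu
```

An exact-match lookup fails for 2.0.0+cpu even though 2.0.0 is known, which is the behaviour reported in this issue.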

To summarize: this boils down to a question of whether pip-audit should consider "bare" and "local tagged" versions with the same basic version the same for auditing purposes, i.e. whether we should normalize 2.0.0+cpu to 2.0.0.

Argument for: Even when different, vulnerabilities reported in X.Y.Z may be of interest to people running X.Y.Z+foo. We should err on the side of caution and report vulnerabilities for the same "base" version, since it's a stronger signal than not.

Argument against: When a package reports its version as X.Y.Z+foo, it's telling us something different from X.Y.Z, and that difference is important. We arguably shouldn't override that intent.

CC @tetsuo-cpp and @di for thoughts. I'm personally inclined to say that we should support "normalizing" local versions into their "base" version, although perhaps behind an option or flag that isn't enabled by default.
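As a rough sketch of the opt-in idea suggested above, normalization could be gated behind a switch that defaults to off. The function name `version_for_audit` and the `normalize_local` parameter are assumptions made for illustration; pip-audit does not necessarily expose anything like them.

```python
def version_for_audit(version: str, normalize_local: bool = False) -> str:
    """Return the version string to look up in the vulnerability source.

    With normalize_local=False (the default) the version is passed through
    unchanged, so X.Y.Z+foo is treated as distinct from X.Y.Z.
    With normalize_local=True the local label is dropped.
    """
    if normalize_local:
        return version.split("+", 1)[0]
    return version


print(version_for_audit("2.0.0+cpu"))                        # 2.0.0+cpu
print(version_for_audit("2.0.0+cpu", normalize_local=True))  # 2.0.0
```

Keeping the default off preserves the "argument against" behaviour, while users who accept the trade-off can ask for results based on the base version.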

di commented 1 year ago

Agreed. PyPI can't distribute vulnerability data for releases that aren't on PyPI (no matter how similar the version numbers look).

@cornelius-braun, I'm curious, when you saw the "Dependency not found on PyPI and could not be audited", was this clear enough? Is there more we could do here to say "you installed something we've never seen before, we have no way to tell you if there are known vulnerabilities for it?"

(As an aside, if we standardized the vulnerability API, the pytorch index could offer vulnerability details here, but that is a much bigger effort)

cornelius-braun commented 1 year ago

Thank you both for your elaborate replies!

@cornelius-braun, I'm curious, when you saw the "Dependency not found on PyPI and could not be audited", was this clear enough? Is there more we could do here to say "you installed something we've never seen before, we have no way to tell you if there are known vulnerabilities for it?"

To me, it was clear that you could not find information about the torch installation because it was not found on PyPI.

However, since an --extra-index-url flag is supported, which allows installing packages from outside PyPI, I was not sure whether you were also checking other vulnerability sources.

Based on your explanations, your procedure now makes complete sense to me.

di commented 1 year ago

I think we do want to support this eventually, but we could make it more clear that it's not currently supported.

woodruffw commented 1 year ago

Agreed! I think we can improve the user experience here with the following:

If the user passes --index-url or --extra-index-url, we should emit a warning telling them that the PyPI vulnerability source won't necessarily report vulnerabilities for dependencies resolved from their sources.
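A minimal sketch of what such a warning might look like, assuming an argparse-based command line. The parser, the `index_warning` helper, the logger name, and the message wording are all hypothetical and are not taken from pip-audit's source.

```python
import argparse
import logging
from typing import Optional

logger = logging.getLogger("audit_sketch")


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser with only the two index options discussed here."""
    parser = argparse.ArgumentParser(prog="audit-sketch")
    parser.add_argument("--index-url", default=None)
    parser.add_argument("--extra-index-url", action="append", default=[])
    return parser


def index_warning(args: argparse.Namespace) -> Optional[str]:
    """Return a warning message if a custom index was requested, else None."""
    if args.index_url or args.extra_index_url:
        return (
            "A custom package index was specified; the PyPI vulnerability "
            "source may not report vulnerabilities for dependencies "
            "resolved from it."
        )
    return None


args = build_parser().parse_args(
    ["--extra-index-url", "https://download.pytorch.org/whl/cpu"]
)
message = index_warning(args)
if message is not None:
    logger.warning(message)
```

Returning the message instead of logging it directly keeps the check easy to test separately from the logging setup.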

woodruffw commented 1 year ago

Assigned to both myself and @tnytown; we'll triage it based on availability during the sprints.

tufanalbayrak commented 5 months ago

Hi. I also have the same problem. pip-audit fails to find cpu versions of torch and torchvision on PyPI. Is there any progress here? Thanks.

woodruffw commented 5 months ago

Hi. I also have the same problem. pip-audit fails to find cpu versions of torch and torchvision on PyPI. Is there any progress here? Thanks.

That sounds like a different issue, since this issue is about third party index URL handling. Could you please file a separate issue and include an example for us to reproduce your problem with?