pypa / pip-audit

Audits Python environments, requirements files and dependency trees for known security vulnerabilities, and can automatically fix them
https://pypi.org/project/pip-audit/
Apache License 2.0
940 stars 62 forks source link

Support Poetry via `poetry.lock` #84

Open woodruffw opened 2 years ago

woodruffw commented 2 years ago

Breakout from #73.

di commented 1 year ago

Overall, I think Poetry users would probably be better served by a poetry audit subcommand or something similar. I'd be happy exposing underlying generic functionality here (via a public API) to support that command.

If that happens, we should probably explore renaming this to audit or something similar, to make it clear that it's not pip-specific.

di commented 1 year ago

Or, spinning out generic functionality into audit, and keeping pip-audit as a pip-specific wrapper around that functionality.

woodruffw commented 1 year ago

If that happens, we should probably explore renaming this to audit or something similar, to make it clear that it's not pip-specific.

This makes sense to me -- it looks like audit is taken on PyPI, so in the style of pip's generic resolver being resolvelib we could make it into auditlib 🙂

From there, pip-audit would remain pretty much the same -- the only things that would change would be moving all of the interface.py files over to auditlib and depending on it.

di commented 1 year ago

I think we can definitely acquire audit -- hasn't seen an update since 2010.

orsinium commented 1 year ago

I hoped to use pip-audit in our internal poetry-based project, but I can't without the support for poetry.lock (which, as you can see in the PR, is quite short). having poetry audit would be great, but I'm not sure if and when it will happen. When I did dephell, the biggest difference between dephell and anything else (including poetry) was having everything out-of-the-box (including, unsurprisingly, dephell deps audit). And dephell, despite being archived 2 years ago or so, still has more stuff than poetry will ever have to offer. So, I don't have high hopes for poetry, and I'd be happy if pip-audit takes the lead on this initiative. If not, oh well, I'll look for alternatives.

severo commented 1 year ago

I currently export from poetry to requirements.txt, then process the file:

bash -c 'poetry run pip-audit -r <(poetry export -f requirements.txt --with dev)'

But there is an issue (described in https://github.com/python-poetry/poetry-plugin-export/issues/129 and https://github.com/python-poetry/poetry-plugin-export/issues/157) when the requirements.txt file contains lines like this (I removed the hashes for readability):

requests==2.28.1 ; python_full_version == "3.9.6"
requests[socks]==2.28.1 ; python_full_version == "3.9.6"

pip-audit complains about duplicate requirements:

ERROR:pip_audit._cli:package requests has duplicate requirements: requests[socks]==2.28.1 (from RequirementLine(line_number=1992, line='requests[socks]==2.28.1 ; python_full_version == "3.9.6"     --hash=sha256:7c5599b102feddaa661c826c56ab4fee28bfd17f5abca1ebbe3e7f19d7c97983     --hash=sha256:8fefa2a1a1365bf5520aac41836fbee479da67864514bdb821f31ce07ce65349', filename=PosixPath('/dev/fd/63')))

Note that the poetry team does not consider this as a bug in the export command; see why here: https://github.com/python-poetry/poetry/pull/5688

Is there a workaround? Should I open a dedicated issue for that?

woodruffw commented 1 year ago

Yeah, could you open a separate issue for that? We generally try to match the same restrictions that pip placed on requirements files, so we'll need to check what they do.

koyeung commented 1 year ago

I currently export from poetry to requirements.txt, then process the file:

bash -c 'poetry run pip-audit -r <(poetry export -f requirements.txt --with dev)'

But there is an issue (described in python-poetry/poetry-plugin-export#129 and python-poetry/poetry-plugin-export#157) when the requirements.txt file contains lines like this (I removed the hashes for readability):

requests==2.28.1 ; python_full_version == "3.9.6"
requests[socks]==2.28.1 ; python_full_version == "3.9.6"

pip-audit complains about duplicate requirements:

ERROR:pip_audit._cli:package requests has duplicate requirements: requests[socks]==2.28.1 (from RequirementLine(line_number=1992, line='requests[socks]==2.28.1 ; python_full_version == "3.9.6"     --hash=sha256:7c5599b102feddaa661c826c56ab4fee28bfd17f5abca1ebbe3e7f19d7c97983     --hash=sha256:8fefa2a1a1365bf5520aac41836fbee479da67864514bdb821f31ce07ce65349', filename=PosixPath('/dev/fd/63')))

Note that the poetry team does not consider this as a bug in the export command; see why here: python-poetry/poetry#5688

Is there a workaround? Should I open a dedicated issue for that?

It would be great to have poetry support in pip-audit, but it might change pip-audit scope. On the other hand, I've done a plugin https://pypi.org/project/ko-poetry-audit-plugin/ on poetry side. See if it could be helpful.

lhoestq commented 1 year ago

The latest version of pip-audit 2.5.4 fixes the issue with requirements with extras/duplicates :)

see https://github.com/pypa/pip-audit/issues/564

acdha commented 1 year ago

I realize this is a somewhat complex issue but I was thinking that it might good to have at least some basic support because right now pip-audit's behaviour can give users the false impression that they're protected. If you have a Poetry project and you run pip-audit one of two things will happen:

  1. Your virtualenv is stored outside the project, which means that pip-audit will run and say no vulnerabilities were found but that's just saying that nothing was scanned.
  2. Your virtualenv is stored inside the project (i.e. poetry config virtualenvs.in-project true) and pip-audit inspects the virtualenv and may even report vulnerabilities and fix them.

That second case in particular seems like somewhat dangerous since it looks like it's doing something but it's only fixing the developer's local install, to the point where I think it might be preferable if pip-audit just threw an exception saying it found no sources to scan. Maybe it would be a good long-term plan to migrate to having pip-audit list the auto-detected sources and perhaps even fail if it can't find one of the supported lock-file formats.

I appreciate the challenge around supporting external tools like Poetry. Since #402 doesn't support the fix mode anyway, I was almost wondering whether it would make sense to only touch Poetry's external interface by using the output of poetry export when poetry.lock exists. That's not as nice as what's in #402 in one way but would avoid needing to track Poetry's format.

woodruffw commented 1 year ago

Thanks for bringing this up @acdha!

Long term, I think our plan is still to encourage this by devolving pip-audit into a library-only audit package, which poetry could then integrate with.

I was almost wondering whether it would make sense to only touch Poetry's external interface by using the output of poetry export when poetry.lock exists. That's not as nice as what's in https://github.com/pypa/pip-audit/pull/402 in one way but would avoid needing to track Poetry's format.

I'm hesitant to do this: it requires us to shell out to a command that we don't actually have a dependency on. I don't know Poetry well enough to have a well-formed opinion here, but I'm also concerned that detecting based on poetry.lock will still snarl users who don't use Poetry's lockfile support (I believe this is an officially supported use pattern)?

In terms of nudging these users away from potentially misleading invocations, I'm thinking that maybe it makes sense to emit a warning on stderr if pip-audit is run within a directory that contains a poetry.lock or pyproject.toml with [tool.poetry], similar to what we do when users pass us error-prone options like --no-deps.

What do you think about that? That doesn't improve our support story, unfortunately, but it better clarifies what users should expect from pip-audit in Poetry-based projects, and leaves open the longer-term audit plan (since it would only happen at the pip-audit CLI layer, not in the core library).

acdha commented 1 year ago

I was almost wondering whether it would make sense to only touch Poetry's external interface by using the output of poetry export when poetry.lock exists. That's not as nice as what's in #402 in one way but would avoid needing to track Poetry's format.

I'm hesitant to do this: it requires us to shell out to a command that we don't actually have a dependency on. I don't know Poetry well enough to have a well-formed opinion here, but I'm also concerned that detecting based on poetry.lock will still snarl users who don't use Poetry's lockfile support (I believe this is an officially supported use pattern)?

In terms of nudging these users away from potentially misleading invocations, I'm thinking that maybe it makes sense to emit a warning on stderr if pip-audit is run within a directory that contains a poetry.lock or pyproject.toml with [tool.poetry], similar to what we do when users pass us error-prone options like --no-deps.

That definitely makes sense – I appreciate the concerns about tighter coupling to someone else's project, especially since they rather disappointingly rejected the idea of including an audit command, but I do think it would be worth making it obvious to users that a bare pip-audit is not doing what they probably think it's doing. I was wondering about a pre-commit hook for this since it'd be nice to be able to at least opt-in to something which works with the most common Poetry usage pattern.

The other related change I was thinking might be useful is making it clear what's being inspected – it seems a little too easy to run pip-audit in a project directory and not realize that what was scanned wasn't the current directory but the Python environment which pip-audit is running from. I'm wondering whether that might be worth a simple “Scanning Python install in ” message, especially if stdout is a TTY.

woodruffw commented 1 year ago

it seems a little too easy to run pip-audit in a project directory and not realize that what was scanned wasn't the current directory but the Python environment which pip-audit is running from. I'm wondering whether that might be worth a simple “Scanning Python install in ” message, especially if stdout is a TTY.

Strongly agreed! I'd prefer it go to stderr instead (and through the logger, since that's where we put other warnings), but I think this would be great to have.

Would you be interested in making a PR for that, along with a change to warn users about potentially unintuitive behavior when run on a Poetry-based project directory? No problem if not, I should have some free time to make these changes in the coming days.

acdha commented 12 months ago

I’m a little busy this week but we could use that at work. Let me take a stab at it.

PhorstenkampFuzzy commented 2 weeks ago

We habe the same issue. Any news here?

orsinium commented 2 weeks ago

Any news here?

I don't think so. My PR #402 is floating in space:

https://github.com/pypa/pip-audit/pull/402#issuecomment-1305814733

PhorstenkampFuzzy commented 2 weeks ago

This is so sad. Do you have a workaround for requirements.txt with hashes that works good? Any suggestions?

orsinium commented 2 weeks ago

For poetry projects, you have 2 options:

  1. export requirements.txt and then analyze that.
  2. Use safety or snyk

If poetry isn't a requirement, you can use pip-tools or uv to generate and manage a constraints.txt lock file from pyproject.toml-based projects (I personally use flit). It works great but requires a bit more boilerplate than poetry. The uv team is on their way to fix it, I'm looking forward to it.

woodruffw commented 2 weeks ago

The statue quo is this:

TL;DR: @orsinium's recommendations are right: if you want to use this with Poetry or another non-standard Python packaging tool, you need to first export your dependencies into a format that the standard tools use: either a standard pyproject.toml, or a requirements.txt.

Alternatively, you can always run pip-audit on the virtual environment itself, i.e. install it into whatever venv poetry is managing and have it audit that. That will always work regardless of whatever packaging tool you're using, since it operates solely on the active environment.


In the short term, there are two things that a contributor could do to help out here:

  1. We could raise a warning or error on the case mentioned in https://github.com/pypa/pip-audit/issues/84#issuecomment-1612006803
  2. The README could use a "tips and tricks" entry for poetry around exporting to requirements, similar to the current pipenv one: https://github.com/pypa/pip-audit?tab=readme-ov-file#running-against-a-pipenv-project

If anybody is interested in sending a PR for either of those, I would be happy to review it.

PhorstenkampFuzzy commented 2 weeks ago

@woodruffw Thank you for your explanation. That made many things much clearer. Could you explain, for general understanding, why pip-audit has a problem with these kinds of duplicate dependencies where one uses extras and the other does not? Sorry this question may belong more to issue #662

PhorstenkampFuzzy commented 2 weeks ago

I withdraw the question. I think if issue #662 is solved most can just use the poetry export piped to pip-audit. Without a solution of #662 this will be hit or miss.