bndr / pipreqs

pipreqs - Generate pip requirements.txt file based on imports of any project. Looking for maintainers to move this project forward.
Apache License 2.0
6.38k stars 388 forks source link

Mitigation for dependency confusion in pipreqs #364

Closed adeadfed closed 1 year ago

adeadfed commented 1 year ago

Hi folks! I am a security researcher, and I believe I have found a way to perform a dependency confusion attack on pipreqs. This pull request aims to mitigate a larger portion of the impact, however, there are still some caveats left (will be discussed later).

Vulnerability description

Impact: Arbitrary code execution on all machines running the Python code with requirements.txt file that includes one of the vulnerable packages and was generated through pipreqs.

Difficulty: The exploit needs several prerequisites listed below.

tl;dr

Pipreq's remote dependency resolving mechanism (lines 447-449) can be abused to inject arbitrary packages into the final requirements.txt file.

There are three necessary conditions for triggering this behavior:

In-depth explanation

Let’s use the djangorestframework-simplejwt package as an example. It has over 1.3M monthly downloads, according to PyPIStats (https://pypistats.org/packages/djangorestframework-simplejwt). The package exports the rest_framework_simplejwt Python module.

Consider the following snippet of code from the above package's documentation.

from rest_framework_simplejwt.views import (
    TokenObtainPairView,
    TokenRefreshView,
)

urlpatterns = [
    ...
    path('api/token/', TokenObtainPairView.as_view(), name='token_obtain_pair'),
    path('api/token/refresh/', TokenRefreshView.as_view(), name='token_refresh'),
    ...
]

When pipreqs is run on the code above, the following things happen:

  1. Pipreqs extracts the imported module name (rest_framework_simplejwt) and places it into the candidates variable at line 425.

    Untitled

  2. Pipreqs tries to find the rest_framework_simplejwt value in the mapping file at line 429 through the get_pkg_names function, but it isn’t there, so the script assumes that the package name is rest_framework_simplejwt.

    Untitled

  3. The script tries to find the rest_framework_simplejwt module in the exports of all locally installed Python packages at line 445 by default.

    If the package djangorestframework-simplejwt is installed locally, the function will find it and assign it to the local variable.

    Untitled

    If not, the function will return an empty array.

    Untitled

  4. Pipreqs then compares the names inside candidates and local variables to populate a difference variable at line 447.

    Since the candidates variable contains rest_framework_simplejwt, and the local variable is either empty or has the name of the PyPI package (djangorestframework-simplejwt), this comparison will always evaluate to True. Consequently, the code will simply copy the candidates entries into the difference variable.

    Untitled

  5. Then get_imports_info function is triggered with the differencearray (['rest_framework_simplejwt']). It tries to find PyPI packages with the same names as in the passed array.

    If a PyPI package name inside the difference array is missing from PyPI, the code will ignore it. However, if an attacker registers the name on PyPI, pipreqs will inject this malicious package to "requirements.txt". And, during the dependency installation, attacker-controlled code will be executed on the user's machine.

Proof of Concept

I've created a PoC using the rest_framework_simplejwt module mentioned above (https://pypi.org/project/rest-framework-simplejwt/).

If you run pipreqs on the code above, you will see that my package is injected into the requirements.txt:

pipreqs --print
djangorestframework_simplejwt==5.2.2 <--- legitimate package 
rest_framework_simplejwt==0.0.2      <--- malicious package

INFO: Successfully output requirements

In fact, this behavior above is the source of a 3y.o. open issue (https://github.com/bndr/pipreqs/issues/218).

What can be done about this?

This pull request reworks the local package resolution. In particular:

These 3 steps should improve the quality of the requirements.txt output for packages that are installed locally.

New pipreqs version's output for the same code above:

pipreqs --print
djangorestframework_simplejwt==5.2.2 <--- legitimate package only

INFO: Successfully output requirements

Unfortunately, there is still a fundamental issue with the Python packaging system that we cannot address in any way. get_imports_info function will anyway output a flawed requirements list if a correct package is not installed locally.

So, I added a warning message into the CLI output for users when using a remote resolution to check the list of the final requirements for the correct packages:

pipreqs --print
WARNING: Import named "rest_framework_simplejwt" not found locally. Trying to resolve it at the PyPI server.
WARNING: Import named "rest_framework_simplejwt" was resolved to "rest-framework-simplejwt:0.0.2" package (https://pypi.org/project/rest-framework-simplejwt/).
Please, verify manually the final list of requirements.txt to avoid possible dependency confusions.
rest_framework_simplejwt==0.0.2     <--- malicious package  
INFO: Successfully output requirements

This bug is also a good reasoning to make --use-local flag a default option, and create a --use-remote flag to use a remote package name resolution. However, this change would violate one of the pipreqs' use cases, so I will leave this proposal for a public discussion.

bndr commented 1 year ago

Hey @adeadfed, I'm ready to merge your fix once the linting issues are fixed.

Cheers, Vadim

adeadfed commented 1 year ago

Thanks @bndr! I've fixed the PEP8 linting, should be good now.

codecov-commenter commented 1 year ago

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.25 :tada:

Comparison is base (f97d8b9) 86.71% compared to head (2103371) 86.97%.

:mega: This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #364 +/- ## ========================================== + Coverage 86.71% 86.97% +0.25% ========================================== Files 2 2 Lines 256 261 +5 ========================================== + Hits 222 227 +5 Misses 34 34 ``` | [Impacted Files](https://codecov.io/gh/bndr/pipreqs/pull/364?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None) | Coverage Δ | | |---|---|---| | [pipreqs/pipreqs.py](https://codecov.io/gh/bndr/pipreqs/pull/364?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None#diff-cGlwcmVxcy9waXByZXFzLnB5) | `86.82% <100.00%> (+0.26%)` | :arrow_up: | Help us with your feedback. Take ten seconds to tell us [how you rate us](https://about.codecov.io/nps?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None). Have a feature suggestion? [Share it here.](https://app.codecov.io/gh/feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None)

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.