Closed adeadfed closed 1 year ago
Hey @adeadfed, I'm ready to merge your fix once the linting issues are fixed.
Cheers, Vadim
Thanks @bndr! I've fixed the PEP8 linting, should be good now.
Patch coverage: 100.00% and project coverage change: +0.25 :tada:
Comparison is base (f97d8b9) 86.71% compared to head (2103371) 86.97%.
Hi folks! I am a security researcher, and I believe I have found a way to perform a dependency confusion attack on pipreqs. This pull request aims to mitigate a large portion of the impact; however, some caveats remain (discussed later).
Vulnerability description
Impact: Arbitrary code execution on all machines that run Python code with a requirements.txt file that was generated by pipreqs and includes one of the vulnerable packages.
Difficulty: The exploit requires several prerequisites, listed below.
tl;dr
pipreqs' remote dependency resolution mechanism (lines 447-449) can be abused to inject arbitrary packages into the final requirements.txt file.
There are three necessary conditions for triggering this behavior:
1. The name of the PyPI package (later on `pypi_package_name`) and the names of its exported Python modules (later on `exported_module_name`) must differ.
2. The `exported_module_name`:`pypi_package_name` mapping must be absent from pipreqs' hard-coded mapping file.
3. `exported_module_name` must be available on PyPI as a package name.
In-depth explanation
Let’s use the `djangorestframework-simplejwt` package as an example. It has over 1.3M monthly downloads, according to PyPIStats (https://pypistats.org/packages/djangorestframework-simplejwt). The package exports the `rest_framework_simplejwt` Python module.
Consider the following snippet of code from the above package's documentation.
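As a minimal stand-in for such a snippet (an assumption: a single import of the kind the package's documentation uses), the sketch below keeps the source as a string and extracts the top-level imported module names with Python's `ast` module. The helper `top_level_imports` is hypothetical and only approximates what an import scanner does; pipreqs' own extraction may differ in detail.

```python
import ast

# Stand-in source (an assumption). It is only parsed, never executed,
# so the package does not need to be installed to run this sketch.
SAMPLE_SOURCE = """
from rest_framework_simplejwt.views import TokenObtainPairView, TokenRefreshView
"""


def top_level_imports(source):
    """Return the set of top-level module names imported by `source`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports, which are project-internal.
            names.add(node.module.split(".")[0])
    return names


print(sorted(top_level_imports(SAMPLE_SOURCE)))
```

The only name recovered is the module name `rest_framework_simplejwt`; nothing in the source mentions the PyPI package name `djangorestframework-simplejwt`.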
When pipreqs is run on the code above, the following things happen:
1. Pipreqs extracts the imported module name (`rest_framework_simplejwt`) and places it into the `candidates` variable at line 425.
2. Pipreqs tries to find the `rest_framework_simplejwt` value in the mapping file at line 429 through the `get_pkg_names` function, but it isn’t there, so the script assumes that the package name is `rest_framework_simplejwt`.
3. By default, the script tries to find the `rest_framework_simplejwt` module in the exports of all locally installed Python packages at line 445.
   - If the package `djangorestframework-simplejwt` is installed locally, the function will find it and assign it to the `local` variable.
   - If not, the function will return an empty array.
4. Pipreqs then compares the names inside the `candidates` and `local` variables to populate a `difference` variable at line 447.
   - Since the `candidates` variable contains `rest_framework_simplejwt`, and the `local` variable is either empty or has the name of the PyPI package (`djangorestframework-simplejwt`), this comparison will always evaluate to `True`. Consequently, the code will simply copy the `candidates` entries into the `difference` variable.
5. Then the `get_imports_info` function is triggered with the `difference` array (`['rest_framework_simplejwt']`). It tries to find PyPI packages with the same names as in the passed array.
   - If a name inside the `difference` array is missing from PyPI, the code will ignore it. However, if an attacker registers the name on PyPI, pipreqs will inject this malicious package into requirements.txt. Then, during dependency installation, attacker-controlled code will be executed on the user's machine.
Proof of Concept
I've created a PoC using the `rest_framework_simplejwt` module mentioned above (https://pypi.org/project/rest-framework-simplejwt/).
If you run pipreqs on the code above, you will see that my package is injected into the requirements.txt:
In fact, the behavior above is the source of an issue that has been open for three years (https://github.com/bndr/pipreqs/issues/218).
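The resolution flow described in the in-depth explanation can be modeled in a few lines. This is a simplified sketch, not pipreqs' actual code: `find_local`, `find_remote`, `resolve_pre_fix`, the `INSTALLED` list, the `PYPI_INDEX` dictionary, and the placeholder version strings are all hypothetical stand-ins for the local environment and for PyPI.

```python
# Hypothetical stand-ins: one locally installed package and a tiny "PyPI index".
INSTALLED = [
    {
        "name": "djangorestframework-simplejwt",
        "version": "0.0.0",  # placeholder version
        "exports": ["rest_framework_simplejwt"],
    },
]
PYPI_INDEX = {"rest_framework_simplejwt": "0.0.0"}  # name registered by an attacker


def find_local(candidates, installed):
    """Return installed packages that export any of the candidate modules."""
    return [
        {"name": pkg["name"], "version": pkg["version"]}
        for pkg in installed
        if any(module in pkg["exports"] for module in candidates)
    ]


def find_remote(names, index):
    """Return an entry for each name present in the index; ignore the rest."""
    return [{"name": n, "version": index[n]} for n in names if n in index]


def resolve_pre_fix(candidates, installed, index):
    """Model of the flow before the fix."""
    local = find_local(candidates, installed)
    local_names = [pkg["name"].lower() for pkg in local]
    # Module names are compared against *package* names, so a module whose
    # name differs from its package name always lands in `difference`.
    difference = [c for c in candidates if c.lower() not in local_names]
    return local + find_remote(difference, index)


result = resolve_pre_fix(["rest_framework_simplejwt"], INSTALLED, PYPI_INDEX)
print([pkg["name"] for pkg in result])
```

In this model the attacker-registered name ends up in the result whether or not the legitimate package is installed locally, which matches the behavior described above.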
What can be done about this?
This pull request reworks the local package resolution. In particular:
- The `get_locally_installed_packages` function will now return local packages in the form `{'name': 'package_name', 'version': 'package_version', 'exports': ['exported_module_1', 'exported_module_2', ...]}`.
- The `get_import_local` function will now search for `imports` list entries in the `exports` and `name` fields (to account for the pipreqs `mapping`) of the reworked `get_locally_installed_packages` function output.
- The `init` function will now compute the `difference` list entries (packages that are not found locally and have to be resolved remotely) according to the changes made.
These 3 steps should improve the quality of the `requirements.txt` output for packages that are installed locally.
New pipreqs version's output for the same code above:
Unfortunately, there is still a fundamental issue with the Python packaging system that we cannot address in any way. The `get_imports_info` function will still output a flawed requirements list if the correct package is not installed locally.
So, I added a warning message to the CLI output, shown when remote resolution is used, asking users to check that the final requirements list contains the correct packages:
This bug is also a good reason to make the `--use-local` flag the default option and to create a `--use-remote` flag for remote package name resolution. However, this change would break one of pipreqs' use cases, so I will leave this proposal for public discussion.
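For reviewers who want a compact picture of the reworked local resolution listed under "What can be done about this?", here is a minimal sketch. It is not the code in this pull request: the helper names `get_import_local_sketch` and `compute_difference`, the `INSTALLED` data, and the placeholder version are assumptions made for illustration. It only relies on the record shape described above (`name`, `version`, `exports`).

```python
# Hypothetical local environment in the reworked record shape.
INSTALLED = [
    {
        "name": "djangorestframework-simplejwt",
        "version": "0.0.0",  # placeholder version
        "exports": ["rest_framework_simplejwt"],
    },
]


def get_import_local_sketch(imports, installed):
    """Return installed packages whose name or exports match an import."""
    wanted = {name.lower() for name in imports}
    result = []
    for pkg in installed:
        provided = {pkg["name"].lower(), *(m.lower() for m in pkg["exports"])}
        if wanted & provided:
            result.append(pkg)
    return result


def compute_difference(candidates, local):
    """Return candidates not covered by any locally resolved package."""
    covered = set()
    for pkg in local:
        covered.add(pkg["name"].lower())
        covered.update(m.lower() for m in pkg["exports"])
    return [c for c in candidates if c.lower() not in covered]


candidates = ["rest_framework_simplejwt", "some_uninstalled_module"]
local = get_import_local_sketch(candidates, INSTALLED)
difference = compute_difference(candidates, local)
print([pkg["name"] for pkg in local])
print(difference)
```

With the exports taken into account, a module provided by a locally installed package no longer reaches remote resolution. Only names with no local match, such as the made-up `some_uninstalled_module`, are left in `difference`, and those remain subject to the caveat described above.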