Pixee-Bot-Python / OLMo

Modeling, training, eval, and inference code for OLMo
https://allenai.org/olmo
Apache License 2.0

Harden `pickle.load()` against deserialization attacks #14

Closed pixeebot[bot] closed 2 months ago

pixeebot[bot] commented 2 months ago

Python's pickle module is notoriously insecure. While it is very useful for serializing and deserializing Python objects, it is not safe to use pickle to load data from untrusted sources, because pickle can execute arbitrary code while loading data. An attacker can exploit this to run code of their choosing on your system. Unlike yaml, there is no concept of a "safe" loader in pickle. Therefore, it is recommended to avoid pickle and to use a different serialization format such as json or yaml when working with untrusted data.
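To make the risk concrete, here is a minimal sketch using only the standard library. The `Payload` class is hypothetical and exists purely for illustration; it uses a harmless arithmetic expression where a real attacker would put something destructive.

```python
import pickle


class Payload:
    """Hypothetical object whose pickled form asks the loader to call eval()."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair and invokes it at load time.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading does not rebuild a Payload; it runs eval("6 * 7") and returns the result.
result = pickle.loads(blob)
print(result)  # 42
```

The code runs inside `pickle.loads` itself, before the caller has any chance to inspect the result.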

However, if you must use pickle to load data from an untrusted source, we recommend using the open-source fickling library. fickling is a drop-in replacement for pickle that validates the data before loading it and checks for the possibility of code execution. This makes it much safer (although still not entirely safe) to use pickle to load data from untrusted sources.
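As a rough illustration of what checking a pickle before loading it can mean, the sketch below scans the opcode stream with the standard library's `pickletools` module and never unpickles anything. This is not fickling's implementation; the `SUSPICIOUS` set and the `suspicious_opcodes` helper are assumptions made for this example, and the check is deliberately crude (it also flags legitimate pickles of ordinary class instances).

```python
import pickle
import pickletools

# Opcodes that import names or call objects during loading (assumed list for this sketch).
SUSPICIOUS = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST",
    "OBJ", "NEWOBJ", "NEWOBJ_EX", "BUILD",
}


def suspicious_opcodes(blob):
    """Return names of opcodes in blob that can import or call code, without loading it."""
    return [op.name for op, _arg, _pos in pickletools.genops(blob) if op.name in SUSPICIOUS]


class Payload:
    """Hypothetical crafted object, as in a typical pickle exploit."""

    def __reduce__(self):
        return (eval, ("6 * 7",))


plain = pickle.dumps({"a": [1, 2, 3]})   # plain data: dict, list, ints
crafted = pickle.dumps(Payload())        # asks the loader to call eval()

print(suspicious_opcodes(plain))
print(suspicious_opcodes(crafted))
```

Plain containers of built-in values need none of the flagged opcodes, while the crafted pickle must both reference `eval` and call it, so the scan distinguishes the two without executing either.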

This codemod replaces calls to pickle.load with fickling.load in Python code. It also adds an import statement for fickling if it is not already present.

The changes look like the following:

- import pickle
+ import fickling

- data = pickle.load(file)
+ data = fickling.load(file)

Dependency Updates

This codemod relies on an external dependency. We have automatically added this dependency to your project's pyproject.toml file.

The fickling package provides analysis of pickled data to help identify potential security vulnerabilities.

There are a number of places where Python project dependencies can be expressed, including setup.py, pyproject.toml, setup.cfg, and requirements.txt files. If this change is incorrect, or if you are using another packaging system such as poetry, it may be necessary for you to manually add the dependency to the proper location in your project.

More reading

* [https://docs.python.org/3/library/pickle.html](https://docs.python.org/3/library/pickle.html)
* [https://owasp.org/www-community/vulnerabilities/Deserialization_of_untrusted_data](https://owasp.org/www-community/vulnerabilities/Deserialization_of_untrusted_data)
* [https://cheatsheetseries.owasp.org/cheatsheets/Deserialization_Cheat_Sheet.html#clear-box-review_1](https://cheatsheetseries.owasp.org/cheatsheets/Deserialization_Cheat_Sheet.html#clear-box-review_1)
* [https://github.com/trailofbits/fickling](https://github.com/trailofbits/fickling)

🧚🤖 Powered by Pixeebot

Codemod ID: pixee:python/harden-pickle-load

sonarcloud[bot] commented 2 months ago

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
83.3% Duplication on New Code

See analysis details on SonarCloud

pixeebot[bot] commented 2 months ago

I'm confident in this change, and the CI checks pass, too!

If you see any reason not to merge this, or you have suggestions for improvements, please let me know!

pixeebot[bot] commented 2 months ago

Just a friendly ping to remind you about this change. If there are concerns about it, we'd love to hear about them!

pixeebot[bot] commented 2 months ago

This change may not be a priority right now, so I'll close it. If there was something I could have done better, please let me know!

You can also customize me to make sure I'm working with you in the way you want.