python / cpython

The Python programming language
https://www.python.org/
Other
61.21k stars 29.53k forks source link

pickle.load() out-of-memory with small file #115952

Open gvozdila opened 4 months ago

gvozdila commented 4 months ago

Bug report

Bug description:

With some files python trying to pickle.load() use all available RAM and get killed by OOM killer.

How-to-reproduce:

import pickle
f = open("./oom-1a498cedae1dc957f2fbb15d18a5c265dc7b9ae2.txt","rb")
pickle.load(f)

where oom-1a498cedae1dc957f2fbb15d18a5c265dc7b9ae2.txt:

]replace.

Whats-happend: RAM consumption grows till process killed by OOM killer

Whats-you-expect: some kind of error?

I'll include 3 examples of "broken" pickle files. But i cannot find the way to see similarities in content. And I'm not very well versed in the pickle format to find out the root of the problem.

CPython versions tested on:

3.9, 3.12

Operating systems tested on:

Linux

Linked PRs

gvozdila commented 4 months ago

oom-1a498cedae1dc957f2fbb15d18a5c265dc7b9ae2.txt oom-37087e76e91539d4701f1019fc59f35107f1a868.txt oom-90ab361d29fda080e4094ee4cd58d3415e88fbe2.txt

Eclips4 commented 4 months ago

I'm not sure whether it's worth fixing. The pickle.load() function accepts something that is the semantic output of pickle.dump(), but ]replace. does not look like the output of pickle.dump().

ericvsmith commented 4 months ago

Agreed: if pickle.dump() can product that, we should fix it. But unpickling random data will never be safe.

gvozdila commented 4 months ago

Sometimes CPU and RAM consumption can be CVE. Not for pickle, but for other formats.

Example: plistlib.py- DoS attack via CPU and RAM by malformed Apple Property List files in binary format. CVE-2022-48564

For this case most problem is the kill by OOM-killer.

I found something similar for marshal.load(). But marshal throws ValueError: bad marshal data (unknown type code).

marshal example:

import marshal
f=open("./oom-6c95714cba63108c15c13acac86e844aec88301b","rb")
marshal.load(f)

oom-6c95714cba63108c15c13acac86e844aec88301b.txt

Eclips4 commented 4 months ago

Sometimes CPU and RAM consumption can be CVE. Not for pickle, but for other formats.

Example: plistlib.py- DoS attack via CPU and RAM by malformed Apple Property List files in binary format. CVE-2022-48564

For this case most problem is the kill by OOM-killer.

I found something similar for marshal.load(). But marshal throws ValueError: bad marshal data (unknown type code).

marshal example:

import marshal
f=open("./oom-6c95714cba63108c15c13acac86e844aec88301b","rb")
marshal.load(f)

oom-6c95714cba63108c15c13acac86e844aec88301b.txt

The same applies for the marshal: Documentation states:

Warning The marshal module is not intended to be secure against erroneous or maliciously constructed data. Never unmarshal data received from an untrusted or unauthenticated source.

ronaldoussoren commented 4 months ago

This appears to be a problem with the C implementation of pickle, the python implementation does raise an error for all examples (experimentally checked by creating a dummy _pickle.py in the CWD before invoking python -m pickle file.txt. This could indicate missing or incorrect handling of partial input in the C implementation (I haven't looked at that implementation though).

Regardless of this loading arbitrary pickle files is unsafe, as mentioned in the documentation.

serhiy-storchaka commented 4 months ago

I think that it is worth fixing. There are many tests for errors detection in broken pickle data (see for example bpo-23914/#68102, bpo-25761/#69947). While by default unpickling can execute arbitrary code and therefore is not safe, with overridden find_class it can be made more safe. But there are issues not related to execution arbitrary code, and some of them (not this one) are difficult to fix.

serhiy-storchaka commented 2 months ago

Loading a small data which does not even involve arbitrary code execution could consume arbitrary large amount of memory. There were three issues here:

The solution of the first issue can potentially break the user code that uses third-party pickle generators. In normal cases this is not an issue. The stdlib implementations always use sequential numbers without gaps (even after optimization), and the threshold is high enough, so even if the dumb third-party optimizer does not re-pack indices, the error can realistically occur only for very large data (hundreds of millions of items). But it had to be said that this is potentially a breaking change.

The third issue could be solved more generally on the lower level: in FileIO, BufferedReader and RawIOBase. This would solve similar vulnerabilities in other binary file formats and protocols. But this idea was not supported, so we must test each binary file formats and protocols separately for this vulnerability.

eli-schwartz commented 2 months ago

But I thought it was acknowledged that this isn't a vulnerability regardless of any other factors, since the pickle module is explicitly a non-security-safe domain.

It's not possible to get a security vulnerability, even by loading data that doesn't contain arbitrary code execution, since that presupposes one is permitted to decide that the security warning in the pickle module doesn't apply if you do unspecified third-party analysis of a pickle, decide that it's safe, and end up being wrong.

serhiy-storchaka commented 1 month ago

It is a non-security-safe domain because unpickling involves arbitrary code execution. But with overridden find_class() method you can control how names are resolved, and therefore what is permitted to be executed. It is not easy, but it should be enough to make unpickling of limited data security-safe. If there would not be vulnerabilities in basic opcodes.