PyO3 / pyo3

Rust bindings for the Python interpreter
https://pyo3.rs
Apache License 2.0
12.39k stars 765 forks source link

Soundness issue: users can create out-of-bounds reads and writes using `PyClassInitializer` #4452

Open ChayimFriedman2 opened 3 months ago

ChayimFriedman2 commented 3 months ago

Bug Description

PyClassInitializer can be given a (chain of) struct to initialize the class, or a Py<Struct>. But in the second case, it propagates the instance down the inheritance line, when subclasses gets a pointer to it and treat it like it were of their type - while it was originally an other type, with less bytes! As such, a user can exploit this to cause a OOB read/write.

Steps to Reproduce

The following code (you may need to increase the bytes count):

use pyo3::prelude::*;

#[pyclass(subclass)]
pub struct BaseClass {}

#[pymethods]
impl BaseClass {
    #[new]
    pub fn new() -> Self {
        Self {}
    }
}

#[pyclass(extends=BaseClass)]
pub struct SubClass {
    data: [u8; 100_000],
}

#[pymethods]
impl SubClass {
    #[new]
    pub fn new(py: Python<'_>) -> PyResult<PyClassInitializer<Self>> {
        Ok(
            PyClassInitializer::from(Py::new(py, BaseClass::new())?).add_subclass(SubClass {
                data: [1; 100_000],
            }),
        )
    }
}

/// A Python module implemented in Rust.
#[pymodule]
fn oob(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_class::<BaseClass>()?;
    m.add_class::<SubClass>()?;
    Ok(())
}

Crashes the Python interpreter on my system when SubClass is instantiated, probably due to a page fault. Really, anything can happen here.

Backtrace

No response

Your operating system and version

Windows 11

Your Python version (python --version)

Python 3.12.0

Your Rust version (rustc --version)

rustc 1.80.1 (3f5fd8dd4 2024-08-06)

Your PyO3 version

0.22.2

How did you install python? Did you use a virtualenv?

The official website.

Additional Info

Unfortunately, I don't believe this bug can be fixed, since I don't think there exists a Python API that allows us to construct a new object, with a custom target class, that is a copy of an existing (base) class.

A partial fix may be to check we are using the same class (this can be enforced at compile time even). However, I believe it's unlikely that users that provide a value to PyClassInitializer actually want to overwrite it, rather they probably want to duplicate it (which is impossible). Furthermore, this is still unsound with frozen classes (we can forbid them too, though).

I think this safest way will be for this option to be removed.

Discovered while working on https://github.com/PyO3/pyo3/issues/4443.

alex commented 3 months ago

Looks like this was a bug introduced with the "existing" support.

I think the fix here is that add_subclass should error when PyClassInitializerImpl is an Existing.

ChayimFriedman2 commented 3 months ago

I think the fix here is that add_subclass should error when PyClassInitializerImpl is an Existing.

I mentioned this (in a slightly different variant) in my post, and explained why I think it's a bad idea:

A partial fix may be to check we are using the same class (this can be enforced at compile time even). However, I believe it's unlikely that users that provide a value to PyClassInitializer actually want to overwrite it, rather they probably want to duplicate it (which is impossible). Furthermore, this is still unsound with frozen classes (we can forbid them too, though).

alex commented 3 months ago

I'm not sure I follow why you think rejecting add_subclass with Existing is a partial fix? AFAICT that's the entire source of the problem.

ChayimFriedman2 commented 3 months ago

It's a partial fix because (a) frozen classes, and (2) I don't believe it captures the intent of the user, as I explained.

alex commented 3 months ago

Can you explain the issue with frozen classes? It seems like a distinct issue from the subclassing issue.

With respect to Existing, the behavior has never worked, and could not work. I don't think it matters what the user intended, we can't give them what they want, so erroring is better :-)

davidhewitt commented 3 months ago

Ouch. If I had to make a suggestion, we should consider removing the Existing variant and solving #3291 for a more general way to meet the same user need.

As I think @ChayimFriedman2 you have now discovered with the recent issues / reports, the initialization code is long due an overhaul!

alex commented 3 months ago

FWIW, here's a patch that expresses what I was describing:

diff --git a/src/pyclass_init.rs b/src/pyclass_init.rs
index 01983c79b..e1d88850d 100644
--- a/src/pyclass_init.rs
+++ b/src/pyclass_init.rs
@@ -202,6 +202,9 @@ impl<T: PyClass> PyClassInitializer<T> {
         S: PyClass<BaseType = T>,
         S::BaseType: PyClassBaseType<Initializer = Self>,
     {
+        if matches!(self.0, PyClassInitializerImpl::Existing(..)) {
+            panic!("You cannot add a subclass to an existing value.");
+        }
         PyClassInitializer::new(subclass_value, self)
     }

diff --git a/tests/test_class_new.rs b/tests/test_class_new.rs
index fb5ca91db..cbdfbf6d3 100644
--- a/tests/test_class_new.rs
+++ b/tests/test_class_new.rs
@@ -323,3 +323,15 @@ fn test_new_class_method() {
         pyo3::py_run!(py, cls, "assert cls().cls is cls");
     });
 }
+
+#[pyo3::pyclass(extends = SuperClass)]
+struct SubClass {}
+
+#[test]
+#[should_panic]
+fn test_add_subclass_existing() {
+    pyo3::Python::with_gil(|py| {
+        PyClassInitializer::from(pyo3::Py::new(py, SuperClass { from_rust: true }).unwrap())
+            .add_subclass(SubClass {});
+    })
+}

However, there is still an un-soundness because PyClassInitializer::new() is directly exposed.

ChayimFriedman2 commented 3 months ago

@alex Just like this issue can be used to write out of bounds, it can be used to write into frozen classes, and this is unsound.

ChayimFriedman2 commented 3 months ago

FWIW, here's a patch that expresses what I was describing:

Yes, this is how I understood you.

However, there is still an un-soundness because PyClassInitializer::new() is directly exposed.

It is possible to fix it too.

As I said, this can be handled at compile-time, which is better. Now that I understand why we need the conversion from Py (thanks to @davidhewitt's linked issue), I think this is the best way forward.

alex commented 3 months ago

Hmm, can you give an example of the frozen class issue, I think I follow, but I'm not positive.

ChayimFriedman2 commented 3 months ago

Ah, when trying to construct an example I realized it is not possible since any frozen violation will also be out of bounds.