PyO3 / pyo3

Rust bindings for the Python interpreter
https://pyo3.rs
Apache License 2.0
12.06k stars 744 forks source link

Added `PyRefMap` and `PyRefMapMut` for borrowing data nested within a `PyRef`/`PyRefMut` #4203

Open JRRudy1 opened 4 months ago

JRRudy1 commented 4 months ago

This draft PR proposes a possible solution for issue #2300 and PR #4195, which involves adding a Ref::map-like mechanism for borrowing data nested within a PyRef or PyRefMut.

Unlike some of the approaches proposed in #2300 which attempt to create a PyRef<U> to some type U nested within a PyRef<T: PyClass>, this implementation instead defines a wrapper type that encapsulates the original PyRef alongside a raw pointer to U which it then dereferences to. This has the benefit of not relying on any of the pyo3 internals that may be in flux, and makes the safety relative easy to reason about since there is nothing magic going on.

Simplified version

In the simplest case of supporting only immutable borrows from immutable PyRef's, the approach boils down to this:

pub struct PyRefMap<'py, T: PyClass, U> {
    owner: PyRef<'py, T>,
    reference: *const U,
}

impl<'py, T: PyClass> PyRef<'py, T> {
    pub fn into_map<F, U>(self, f: F) -> PyRefMap<'py, T, U>
        where F: FnOnce(&T) -> &U 
    {
        PyRefMap {reference: f(&*self), owner: self}
    }
}

impl<'py, T: PyClass, U> Deref for PyRefMap<'py, T, U> {
    type Target = U;

    fn deref(&self) -> &U {
        // we own the `PyRef` that is guarding our (shared) access to `T`
        unsafe { &*self.reference }
    }
}

This can then be used like so:

let bound = Bound::new(py, MyClass{data: [0i32; 100]};
let data = bound.try_borrow()?.into_map(|c| &c.data);
assert_eq!(data[0], 0);

Full version

The above method should extend seamlessly to mutable borrows through PyRefMut, however this requires adding another wrapper type (e.g. PyRefMutMap) and all the associated impl's. Then another wrapper may be needed if you want to support shared borrows from a PyRefMut... this could be somewhat be avoided by making a single wrapper with some extra generic parameters, but then naming the types gets tedious. So, I ultimately decided to use a single base wrapper type PyRefMapBase<'py, U, Mut> where the contained PyRef<'py, T> or PyRefMut<'py, T> is stored as an opaque trait object Box<dyn OpaquePyRef<'py>> and the pointer is stored as NonNull<U> (which can be derived from either &U or &mut U). The mutability is then handled with the generic parameter Mut: Boolean which is used in type bounds to statically enforce that mutable dereferences only occur when derived from PyRefMut<T> and &mut U. Finally, two type aliases PyRefMap<'py, U> and PyRefMapMut<'py, U> are defined for the immutable and mutable cases, which are the actual names that users will reference (instead of PyRefMapBase directly).

This approach of Boxing the PyRef/PyRefMut has the added benefit of allowing the pyclass type T to be erased from the type name so that only the lifetime and the target type need to be included in signatures (i.e. instead of PyRefMap<'py, MyClass, [i32; 100]> you simply have PyRefMap<'py, [i32; 100]>. This comes at the cost of a pointer-sized Box allocation when the PyRef/PyRefMut is converted to a PyRefMap/PyRefMapMut, but I would imagine that is negligible in the context of interfacing with Python.

ToDo

mejrs commented 4 months ago

Also, the current implementation of pyclass and pyref/pyrefmut is fairly complicated. We'd be open to changes to it if that would make this simpler or more efficient.

davidhewitt commented 4 months ago

Thank you for working on this! I am 👍 on this feature although just wanted to give a heads up it might take me a few days to contribute to any review given it's a tricky one from soundness angle and it's not something I'd actively had in my head until you pushed the PR! 😂

JRRudy1 commented 4 months ago

No problem at all David! I know I've been spamming you with PR's this week 😉

However, I'd actually like to put this PR on a brief hold until the following PR's get resolved:

  1. PR #4219
  2. A new PR that I will submit after 4219 which involves a rework to PyRef and PyRefMut that should simplify things quite a bit, including the PyRefMap implementation proposed in this PR. It should also play nicely with a post-gil-refs transition to PyRef wrapping Borrowed<T> instead of Bound<T>, which I would be interested in working on when the time comes.