PyO3 / pyo3

Rust bindings for the Python interpreter
https://pyo3.rs
Apache License 2.0
12.06k stars 744 forks source link

Added `as_super` methods to `PyRef` and `PyRefMut`. #4219

Closed JRRudy1 closed 3 months ago

JRRudy1 commented 4 months ago

Introduction

This PR simplifies the process of accessing superclasses by adding an as_super method to PyRef and PyRefMut that lets you convert &PyRef<T> to &PyRef<T::BaseType and &mut PyRefMut<T> to &mut PyRefMut<T::BaseType>. This acts as a hybrid between the existing as_ref and into_super methods, providing both the by-reference ergonomics of as_ref with the super-superclass access of into_super while unifying the concepts under a more intuitive/idiomatic pattern.

What's wrong with <PyRef<T> as AsRef<T::BaseType>>::as_ref?

As mentioned in the docs, calling as_ref can give you &T::BaseType but not the super-superclass and beyond. But on top of that, AsRef::as_ref is a general-purpose method used throughout the Rust language, and it is not intuitive that this is how you do the pyo3-specific task of referencing the base class. Sure, the existence of an AsRef<T::BaseType> impl is useful if you want a define a function that is generic over any type that can be used to get &T::BaseType, but this should not be the primary method for doing so (just like Into::into should not be the primary method for converting &str to String ).

But what about PyRef::into_super?

Yes, into_super can also be used to get the super-superclass and beyond, but it consumes the PyRef<T> by value which can be awkward in many cases. If you have obj: PyRef<T> and just want to call a quick method on the super-superclass, your only option is to do obj.into_super().into_super().some_method(), which consumes your PyRef<T>. Now if you want to keep using the child class, hopefully you still have the Bound<T> around to borrow another PyRef (and incur the run-time borrow-checking overhead of doing so). This PR would instead let you instead do obj.as_super().as_super().some_method() and then keep using the original PyRef<T>.

That said, there is certainly still a place for the into_super method; calling as_super requires someone to retain ownership of the PyRef<T>, so you could not return the &PyRef<T::BaseType> from a function; for this you would need to use into_super and return PyRef<T::BaseType> by value. But under this system, the as_super and into_super methods have distinct purposes and idiomatic names that most Rust users will immediately recognize as doing the same thing but by-reference vs. by-value.

Implementation

PyRef::as_super and PyRefMut::as_super methods

The implementations of the new methods added in this PR (PyRef::as_super, PyRefMut::as_super, and the private PyRefMut::downgrade helper function) could literally be replaced by calls to std::mem::transmute (or the equivalent pointer casts); however, I broke them up a bit to make the safety logic more clear. For example, the PyRef::as_super method calls downcast_unchecked::<U> on the inner Bound<T> instead of just going straight from PyRef<T> to PyRef<U>. The same was not done for PyRefMut::as_super, however, since it would required transmuting a shared reference &Bound<U> to a mutable reference &mut PyRefMut<U> which might be fine but feels sketchy.

AsRef and AndMut impls

A nice side effect of adding the purpose-built as_super methods was the ability to simplify the general-purpose AsRef<U> and AsMut<U> impls by simply re-borrowing the reference returned by the as_super call. This avoid the need for additional unsafe code in these impls. However, doing this for PyRefMut's AsRef impl required "downgrading" the PyRefMut into PyRef, which I encapsulated in a private PyRefMut::downgrade associated function.

Safety

The safety logic behind this PR is fairly simple, and can be summarized as the following:

  1. I added #[repr(transparent)] to the PyRef/PyRefMut structs such that they are guaranteed to have the same layout as the Bound<T> they wrap. By extension, PyRef<T> and PyRefMut<T> also have the same layout as each other.
  2. Bound<T> has the same layout as Bound<T::BaseType> and can be transmuted to it freely (this is essentially how downcast_unchecked works, after all).
  3. The new methods only involve references to PyRef/PyRefMut, so the differences in Drop impls are not relevant; transmuting PyRefMut<T> to PyRef<T> would be problematic since dropping it would call release_borrow instead of release_borrow_mut, but dropping references does not call drop so transmuting &PyRefMut<T> to &PyRef<T> is fine.

As a result, the following pointer cast chains should be sound:

JRRudy1 commented 3 months ago

Thanks for the review David! I made the changes as requested. I definitely like your approach of ptr_from_ref and .cast over as casts. I just have a couple question about the new implementation:

A couple other thoughts:

JRRudy1 commented 3 months ago

even if there won't be a stdlib ptr::from_mut function for a while.

Actually it seems like ptr::from_ref and ptr::from_mut are being stabilized together, so I definitely think it makes sense to add a ptr_from_mut function alongside ptr_from_ref unless I'm missing something.

davidhewitt commented 3 months ago

Great questions!

* How do you feel about the way I split up the casts and used explicit types, such as in the `PyRef::as_super` method? I personally prefer being explicit in these cases to help justify the safety, but I could also see the value in being concise by using a single `.cast` with implicit types.

I went back and forth over this, I see upsides to both. Given the runtime cost should be the same I'm happy to keep what you already have here.

* Do you see any value in making `PyRefMut::downgrade` public? I suppose it could be useful in some niche cases.

Sure thing, we can expose it. Maybe as a follow up PR?

* IMO the User Guide should be updated to suggest `as_super` (alongside `into_super`) instead of `as_ref` for getting the base class. If you agree, I can make the change in another PR (or this one, whichever you prefer).

It would be great to update the documentation in this PR, yes please.

* I like your `ptr_from_ref` approach with the ultimate goal of using `ptr::from_ref` when the MSRV allows. But I wonder if it would be worthwhile to add an analogous `ptr_from_mut` function for the `&mut T` -> `*mut T` conversion as well, even if there won't be a stdlib `ptr::from_mut` function for a while.

I'd skipped ptr_from_mut purely because I wanted to see what fellow maintainers thought with just one part of the work initially. I'd support adding it in this PR and using it in the PyRefMut implementation πŸ‘

JRRudy1 commented 3 months ago

I'll work on fixing the checks, but in the meantime:

It would be great to update the documentation in this PR, yes please.

I updated the docs, as well as the example code that follows it. Let me know if you have any concerns about the wording.

I'd support adding it in this PR and using it in the PyRefMut implementation πŸ‘

I added this as well. A couple notes however:

davidhewitt commented 3 months ago
  • On that note... the upcoming ptr::from_ref doesn't actually appear to be const either. If you want to replace ptr_from_ref with ptr::from_ref when MSRV allows, then maybe ptr_from_ref should also not be const. Otherwise code could be written that relies on its const-ness, which would break when making the change.

Just following up here, it looks like ptr::from_ref and ptr::from_mut are const - https://doc.rust-lang.org/stable/std/ptr/fn.from_mut.html

So no changes needed in PyO3! πŸ˜„

JRRudy1 commented 3 months ago

Just following up here, it looks like ptr::from_ref and ptr::from_mut are const - https://doc.rust-lang.org/stable/std/ptr/fn.from_mut.html

So no changes needed in PyO3! πŸ˜„

Well I guess I'm losing it.... I swear I saw it not being const πŸ˜… maybe I was looking at an earlier draft of it or something.

But that's good to hear, I was very surprised when I thought it wasn't.