rust-lang / reference

The Rust Reference
https://doc.rust-lang.org/nightly/reference/
Apache License 2.0

Define raw pointer transmute behavior #1661

Open joshlf opened 1 month ago

joshlf commented 1 month ago

This is needed in order to support generic fat pointer casts in a const context as described here: https://github.com/google/zerocopy/issues/1967

traviscross commented 1 month ago

cc @rust-lang/opsem

RalfJung commented 1 month ago

The section does not really describe when the cast is well-defined, does it? The metadata must not change, i.e. there are constraints on what the old and new "unsized tail" types are.

RalfJung commented 1 month ago

For thin pointers this is uncontroversial, but for wide pointers this does constrain the layout in new ways, so it likely needs an FCP by some team(s). Cc @rust-lang/types @rust-lang/lang

Also, we already have casts that are not equivalent to a transmute: dyn trait upcasts. So just saying this for all well-formed casts is not correct. We should probably list explicitly when the transmute is allowed. I don't know what the right place is to state that.

traviscross commented 1 month ago

@rustbot author

@joshlf: You'll want to revise the proposed language based on @RalfJung's feedback above.

joshlf commented 1 month ago

@RalfJung Maybe you can give me advice on whether it's possible to phrase this formally. What I'd like to say is: "transmuting is equivalent to as casting so long as both pointers have the same metadata type". In particular this would support:

The reason for phrasing it in generic terms rather than explicitly enumerating the cases is that it permits generic unsafe code to be well-defined. In particular, we are working on a trait that looks something like:

/// # Safety
///
/// If `U: CastFrom<T>`, then given `t: *mut T`, `transmute::<_, *mut U>(t)` is sound and produces a pointer which
/// addresses the same number of bytes as `t`.
unsafe trait CastFrom<T: ?Sized> {}

This then allows us to write the following:

const fn cast<T: ?Sized, U: ?Sized + CastFrom<T>>(t: *mut T) -> *mut U {
    // NOTE: This can't be an `as` cast because Rust doesn't know that the vtable kinds match.
    unsafe { core::mem::transmute(t) }
}

We'd like to be able to implement CastFrom for container types over all possible T, e.g.:

unsafe impl<T: ?Sized> CastFrom<T> for ManuallyDrop<T> {}

This impl is only sound if we're able to add generic text to the effect of "transmuting is equivalent to as casting so long as both pointers have the same metadata type", but is not sound if we are only able to enumerate the cases specifically (since there's no way to prove that, for generic T: ?Sized, *mut T and *mut ManuallyDrop<T> satisfy one of the cases).

So my question is: Is there nomenclature already defined whose meaning matches what I'm aiming for here?

CAD97 commented 1 month ago

Is there nomenclature already defined whose meaning matches what I'm aiming for here?

The language to address each unsize kind separately exists (we differentiate between pointer casts that adjust the vtable and ones that don't), but I don't think there's established language to fully generally refer to compatible metadata. While I don't have a reference for official usage of the term, I've seen and used "unsize kind" plenty to generalize "vtable kind" to any pointee metadata.

What could potentially be said is that transmuting raw pointers will transmute the associated Pointee::Metadata, and leave it to DynMetadata to define when it's acceptable to transmute from DynMetadata<Trait + Bounds + A> to DynMetadata<Trait + Bounds + B>. More restrictive, and I think sufficient for what you want, is requiring identical metadata types, which would forbid e.g. *mut (dyn Trait + Send) as *mut dyn Trait.
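To make the auto-trait example concrete, here is a minimal sketch of the `as` cast that an "identical metadata type" rule would leave outside the transmute guarantee. The trait `Shape`, the type `Tri`, and the helper `drop_send` are hypothetical names introduced only for illustration. The cast itself is accepted on stable Rust; the sketch makes no claim about what a `transmute` between these two pointer types would do.

```rust
// Hypothetical trait and implementor, used only for illustration.
trait Shape {
    fn sides(&self) -> u32;
}

struct Tri;

impl Shape for Tri {
    fn sides(&self) -> u32 {
        3
    }
}

// Drops the `Send` auto trait from the pointee type with an `as` cast.
// The source and target pointee types are different `dyn` types.
fn drop_send(p: *mut (dyn Shape + Send)) -> *mut dyn Shape {
    p as *mut dyn Shape
}

fn main() {
    let mut t = Tri;
    let p = &mut t as *mut Tri as *mut (dyn Shape + Send);
    let q = drop_send(p);
    // SAFETY: `q` points to `t`, which is live and not otherwise borrowed here.
    let n = unsafe { (&*q).sides() };
    assert_eq!(n, 3);
}
```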

scottmcm commented 1 month ago

My instinct here (with my lang hat on but not speaking for the team) is that saying this for every fat pointer is premature, because we don't know what all potential fat pointers look like going forward.

Do you really need everything? Would slices be enough? Do you need dyn too?

One way forward that I think there's appetite for doing (see https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/Can.20we.20stabilize.20the.20layout.20of.20.26.5BT.5D.20and.20.26str.3F/near/395760337) is defining the layout for references- and pointers-to-slices, which would be another way to start allowing things like this.

But also, why start from asking about defining transmutes for this rather than asking for APIs that can do it, if the goal is just to go from *T to *U (as opposed to doing that in something nested)?
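For the slice-only case, the API-based route can be sketched with functions that already exist in `core`. This is an illustration under stated assumptions, not a proposal from the thread: `Wrapper` is a hypothetical `repr(transparent)` newtype and `cast_slice` is a hypothetical helper. It rebuilds the wide pointer from its address and length rather than reinterpreting its bits, and it relies on `len` for raw slice pointers, which is available on recent stable toolchains.

```rust
use core::ptr;

// Hypothetical newtype with the same layout as `u8`.
#[repr(transparent)]
struct Wrapper(u8);

// Rebuilds a slice pointer with a new element type, keeping the
// address and the element count, without any transmute.
fn cast_slice(p: *mut [u8]) -> *mut [Wrapper] {
    ptr::slice_from_raw_parts_mut(p as *mut Wrapper, p.len())
}

fn main() {
    let mut data = [10u8, 20, 30];
    let p: *mut [u8] = &mut data[..];
    let q = cast_slice(p);
    assert_eq!(q.len(), 3);
    // SAFETY: `q` covers the same three initialized bytes as `data`,
    // and `Wrapper` is `repr(transparent)` over `u8`.
    let first = unsafe { (&*q)[0].0 };
    assert_eq!(first, 10);
}
```

This covers slices only; it does not help with the generic `T: ?Sized` case discussed in the thread, because there is no stable way to take apart and rebuild an arbitrary wide pointer.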

joshlf commented 1 month ago

Do you really need everything? Would slices be enough? Do you need dyn too?

The problem with not supporting everything is that then we can't support generic conversions (reproducing my example from https://github.com/rust-lang/reference/pull/1661#issuecomment-2432897285):

/// # Safety
///
/// If `U: CastFrom<T>`, then given `t: *mut T`, `transmute::<_, *mut U>(t)` is sound and produces
/// a pointer which addresses the same number of bytes as `t`.
unsafe trait CastFrom<T: ?Sized> {}

const fn cast<T: ?Sized, U: ?Sized + CastFrom<T>>(t: *mut T) -> *mut U {
    // NOTE: This can't be an `as` cast because Rust doesn't know that the vtable kinds match.
    unsafe { core::mem::transmute(t) }
}

unsafe impl<T: ?Sized> CastFrom<T> for ManuallyDrop<T> {}

In particular, the author of the unsafe impl block needs to know that transmuting *mut T into *mut ManuallyDrop<T> is sound for all T: ?Sized. Rust already permits this via as cast; the following compiles:

fn cast_to_manually_drop<T: ?Sized>(t: *mut T) -> *mut ManuallyDrop<T> {
    t as *mut _
}

So I'm not proposing to stabilize any new conversion behavior; I'm only proposing to stabilize that transmute is equivalent to as for such conversions.
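As a concrete, non-generic illustration of the equivalence being proposed, the sketch below compares the `as` cast with `transmute` for `*mut [u8]` to `*mut ManuallyDrop<[u8]>`. The helper names `via_as` and `via_transmute` are hypothetical. The equality it checks is what current compilers produce in practice; it is the behavior this PR aims to have documented, not something the Reference is known to guarantee today.

```rust
use core::mem::{transmute, ManuallyDrop};

// The conversion Rust already accepts as a pointer-to-pointer cast.
fn via_as(p: *mut [u8]) -> *mut ManuallyDrop<[u8]> {
    p as *mut ManuallyDrop<[u8]>
}

// The same conversion written as a transmute.
// ASSUMPTION: both wide pointers share one layout (address + length).
// That is observed behavior, and is exactly what the PR wants specified.
fn via_transmute(p: *mut [u8]) -> *mut ManuallyDrop<[u8]> {
    unsafe { transmute::<*mut [u8], *mut ManuallyDrop<[u8]>>(p) }
}

fn main() {
    let mut data = [1u8, 2, 3, 4];
    let p: *mut [u8] = &mut data[..];

    // Raw pointer equality compares both the address and the metadata.
    assert!(via_as(p) == via_transmute(p));

    // SAFETY: the pointer refers to `data`, which is live and initialized.
    let len = unsafe { (&*via_as(p)).len() };
    assert_eq!(len, 4);
}
```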

But also, why start from asking about defining transmutes for this rather than asking for APIs that can do it, if the goal is just to go from *T to *U (as opposed to doing that in something nested)?

The problem is that I need to support an API like cast above, which needs to be a const fn. Note that cast takes two unrelated T and U - it is able to rely on T and U having compatible metadata thanks to U: CastFrom<T>, not based on anything that Rust has visibility into - in other words, t as *mut U would not compile. (I'm happy to go into more detail about why this is something we need if folks are curious, but I'll leave it at that for now.)

We can already support this today if we're willing to not support const fn; we can write something like:

unsafe trait CastFrom<T: ?Sized> {
    fn cast_from(t: *mut T) -> *mut Self;
}

The problem is that you can't call <U as CastFrom<T>>::cast_from in a const fn.
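A minimal sketch of that non-const approach, filled in for `ManuallyDrop<T>`. The body of `cast_from`, the non-const `cast` wrapper, and `main` are illustrative additions rather than code from the thread, and the safety contract in the comment is paraphrased from the earlier doc comment.

```rust
use core::mem::ManuallyDrop;

/// # Safety
///
/// Implementors promise that `cast_from` returns a pointer addressing
/// the same number of bytes as its argument.
unsafe trait CastFrom<T: ?Sized> {
    fn cast_from(t: *mut T) -> *mut Self;
}

// Each impl uses an `as` cast that the compiler can check for its own types.
unsafe impl<T: ?Sized> CastFrom<T> for ManuallyDrop<T> {
    fn cast_from(t: *mut T) -> *mut Self {
        t as *mut Self
    }
}

// Generic over unrelated `T` and `U`, but not a `const fn`:
// it has to call a trait method.
fn cast<T: ?Sized, U: ?Sized + CastFrom<T>>(t: *mut T) -> *mut U {
    U::cast_from(t)
}

fn main() {
    let mut data = [1u8, 2, 3];
    let p: *mut [u8] = &mut data[..];
    let q: *mut ManuallyDrop<[u8]> = cast(p);
    // SAFETY: `q` refers to `data`, which is live and initialized.
    let len = unsafe { (&*q).len() };
    assert_eq!(len, 3);
}
```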

RalfJung commented 1 month ago

The current docs we have on this say that *T to *V is allowed when "T and V are compatible unsized types, e.g., both slices, both the same trait object".

That seems to be outdated? We allow dyn trait upcasting and discarding auto traits, don't we? Cc @rust-lang/types

traviscross commented 1 month ago

That seems to be outdated? We allow dyn trait upcasting and discarding auto traits, don't we?

We do allow dropping auto traits, and in Rust 1.84, we'll allow dropping the principal:

On dyn trait upcasting, we still need to restabilize that after having landed, in Rust 1.81: