Safe coercions for references of repr(transparent) types

clarfonthey commented 3 years ago

Idea: types which are repr(transparent) whose can be constructed in the current scope should be coercible via as. Going to post this here because I don't really have the time to write an RFC or file an MCP, but this is something I've thought for a while and haven't really seen concretely mentioned anywhere.

Example:

mod scope {
    #[repr(transparent)]
    pub struct Wrapper<T>(T);
    impl<T> Wrapper<T> {
        pub fn wrap(val: &T) -> &Wrapper<T> {
            val as &Wrapper<T> // okay because constructor `Wrapper` is accessible here
        }
        pub fn wrap_slice(slice: &[T]) -> &[Wrapper<T>] {
            val as &[Wrapper<T>] // also okay to coerce inside types
        }
    }
}

fn cant_wrap<T>(val: &T) -> &Wrapper<T> {
    val as &Wrapper<T> // will fail because constructor `Wrapper` isn't accessible here
}

Essentially, right now, this is defined to not invoke undefined behaviour, ever, due to the presence of repr(transparent). However, there is no way to actually do this with safe code. The closest you can do is pointer casts, which ultimately still require an unsafe block to convert the pointers back into valid references, or transmutes, which are inherently unsafe.

I know that there are plans to make safe transmutation work, but I feel like this kind of coercion would also be nice to have.

Lokathor commented 3 years ago

This can basically be done via Trait, other than the part where the as keyword is used.

And we shouldn't be adding more uses of as to the language, we should be adding ways to get away from the as keyword.

clarfonthey commented 3 years ago

That exact trait you mention involves unsafe code, which is why I bring this up. There should be some way to do this with safe code only.

I disagree that we should be moving away from the as operator in all cases; this is a perfectly good use case for it IMHO. How would you suggest a trait to work in all cases, and also manage coercions inside types? The alternative would be to make the coercion automatic, which I don't think is a good idea.

scottmcm commented 3 years ago

One use of this that might make syntax more justifiable was if it was privacy-aware. That's not coming any time soon for traits in general, but it could perhaps be safe to go from &mut u32 <-> &mut Even<u32> with as in only the scopes where the type's constructor is visible.

clarfonthey commented 3 years ago

That's basically what I was proposing :p

burdges commented 3 years ago

There are situations where you want casts of Arc<TransparentWrapper<[T]>> to Arc<[T]> too, which came up in https://github.com/rust-lang/rust/pull/80505

andersk commented 3 years ago

Casting between Container<T> and Container<TransparentWrapper<T>> cannot possibly be valid in general, due to associated types (playground):

use std::mem::transmute;

trait Trait {
    type Associated;
}

struct Foo;
impl Trait for Foo {
    type Associated = &'static usize;
}

#[repr(transparent)]
struct TransparentWrapper<T>(T);
impl<T> Trait for TransparentWrapper<T> {
    type Associated = &'static String;
}

struct Container<T: Trait>(T::Associated);

fn main() {
    let container = Container::<Foo>(&0);
    let transmuted =
        unsafe { transmute::<Container<Foo>, Container<TransparentWrapper<Foo>>>(container) };
    let string: &String = transmuted.0;
    println!("{}", string); // boom
}

burdges commented 3 years ago

Yes, but the question is about specific containers.

andersk commented 3 years ago

Okay; is there a proposal for which specific containers? The original example just says // also okay to coerce inside types.

Here’s another potentially bad example without associated types: casting between BinaryHeap<u32> and BinaryHeap<Reverse<u32>> would break the heap invariant. It would not directly break memory safety, although you might imagine other unsafe code that relies on the heap invariant to guarantee memory safety of some other structure.

(I know you can already break the heap invariant for BinaryHeap<UserTypeWithWackyOrd>, but that doesn’t let you break it for BinaryHeap<u32> today.)

clarfonthey commented 3 years ago

As mentioned, the specific constraint is the visibility of the constructor. So, if you have struct S(u32), you can't coerce u32 to S outside that scope.

In the specific case you gave, you can't rely on the heap invariant in unsafe code since it's a safe trait.

I think it would be reasonably easy to constrain things to prevent the associated type example you gave but I have a headache right now and can't think of a specific example.

andersk commented 3 years ago

In the case of BinaryHeap<Reverse<u32>>, the Reverse constructor is visible, and even if it weren’t, you can re-implement your own copy of Reverse.

Again, I know you can already break the heap invariant for BinaryHeap<UserTypeWithWackyOrd>, but that doesn’t let you break it for BinaryHeap<u32> today. It should be fine for unsafe code to rely on the heap invariant for BinaryHeap<u32> for safety, but your proposal allows safe code to break it.

ruifengx commented 3 years ago

This sounds like Coercible and type role in Haskell. I believe we can just copy-paste the semantics of these two into Rust, and just focus on the concrete syntax.

The privacy issue is addressed by only allowing the compiler to resolve Coercible constraints if the type constructor is in scope and at the point of coercion the field of the #[repr(transparent)] struct in discussion is accessible. And we should make implementation of Coercible unsafe or forbidden (for better protection).

The safety issue is addressed by the three roles of type parameters:

nominal: this type parameter should be exactly the same for the resulting type to be Coercible, useful e.g. when depending on associated types of this type parameter;
representational: this type parameter should have the same representation (i.e. Coercible) for the resulting type to be Coercible, useful when we merely store the value and do not care about the associated types nor its invariant;
phantom: this type parameter may differ, but the resulting type would always be Coercible, useful when we store no real data of this type (AFAIK currently the only possible way is using PhantomData);

For complete safety, we may want to default type roles to nominal if not marked explicitly (thus protect against coercion between BTreeSet<u32> and BTreeSet<MyFancyTypeWithSpecialOrdering> etc., and preserve the current semantics by not allowing coercions between different instantiations of a generic type by default), and if necessary we can always overcome this restriction (if some container type does not yet have their type roles marked explicitly) by opt-in to unsafe. Explicitly marking type roles should be encouraged, though, and perhaps we may want to have an allow-by-default lint for it.

The initial concern of this post (i.e. safe coercions for reference types) can be formalised in this framework by considering reference types as generic types with a single representational parameter. We already allow coercions between raw pointer types, which can also be formalised by considering raw pointers as generic types with a single phantom parameter.

ruifengx commented 3 years ago

It just occurred to me that Rust types might have Drop issues. Types with non-trivial Drop might need to be excluded when resolving Coercible constraints, or we may accidentally skip custom Drop code without an explicit std::mem::forget or ManuallyDrop.

clarfonthey commented 3 years ago

Aren't there already ways that forget can be done without those explicit calls? That doesn't violate safety, since they can call forget safely anyway.

ruifengx commented 3 years ago

@clarfonthey Sure there are std::mem::forget and ManuallyDrop, but they are kind of explicit. What I intended to say was that (1) custom Drop usually means maintaining special invariant, and (2) a coercion looks not so explicit as std::mem::forget or ManuallyDrop::new, especially when wildcard is used in the cast as in x as _.

Even if types with custom Drop is excluded from being Coercible, such coercions are still possible via ManuallyDrop. Consider the following example, where we want to coerce x: A to B, where A has a non-trivial implementation for Drop:

a plain x as B would be denied;
but since ManuallyDrop is #[repr(transparent)], ManuallyDrop::new(x) as B would do the trick;
if B also has a non-trivial Drop, (ManuallyDrop::new(x) as ManuallyDrop<B>).into_inner() would do the trick;

I believe the intention of avoiding Drop calls should be expressed explicitly, and it should be possible to optimise out the overhead of ManuallyDrop::new since it is already #[inline(always)].

Anyway, this is more of a personal taste, if disabling Coercible for types with custom Drop is not desirable, at least it might be helpful to have an allow-by-default lint.

anderspapitto commented 7 months ago

@andersk fancy seeing you here

rust-lang / rfcs

Safe coercions for references of repr(transparent) types #3066