Is transmuting `&’a &’b T` to `&’a &’b mut T` sound or UB?

RustyYato commented 3 years ago

Crossposting from https://users.rust-lang.org/t/is-transmuting-a-b-t-to-a-b-mut-t-sound-or-ub/54070

@steffahn writes

I just came across wondering about this (here 3) and I don’t know where to find an accurate answer.

The question is whether this is sound
pub fn some_casting_function<'a, 'b, T>(x: &'a &'b T) -> &'a &'b mut T {
    unsafe {std::mem::transmute(x)}
}
I think the only thing I’m uncertain about is whether such a function would be instant UB 1.

Pros: You can’t really mutate anything through a &&mut T that you wouldn’t be able to mutate through a &&T as well (e.g. through interior mutability). And a &&mut T is not unique.

Cons: There might be some kind of possibility for misoptimization that I’m missing, or just some kind of language specification saying that it isn’t allowed. Getting creative here: the compiler might assume that for x: &&mut T and y: &&mut T holds: if (x as *const _ != y as *const), then (*x as *const T != *y as *const T).

Also, there could be API’s that only hand out &T, or non-unique Rc<T> or whatever (T: !Clone) so that you wouldn’t be able to get you hands on a value of type &&mut T any other way. And such a library could, potentially, for whatever reason, rely on this...

Another Pros (counterargument to the second Cons): Libraries like take_mut do, too, allow for getting types, e.g. a T by value, from APIs where that might otherwise be impossible, e.g. it only hands out &mut T references and T: !Clone + !Default, etc..

I’d also appreciate e.g. links to previous discussions or existing libraries if you know of any.

digama0 commented 3 years ago

It's not an exact answer to the question, but while investigating this I found that a much more obvious principle, &'a &'b mut T -> &'a &'b T, is unsound:

fn unmut<'a, 'b, T>(x: &'a &'b mut T) -> &'a &'b T {
    unsafe { std::mem::transmute(x) }
}

fn dup<'a, T>(x: &'a mut T) -> (&'a mut T, &'a T) {
    fn pair<A, B>(x: A, y: B) -> (A, B) { (x, y) }
    pair(x, unmut(&x))
}

fn main() {
    let mut vec = Vec::with_capacity(1);
    vec.push(1);
    let (vec, vecref) = dup(&mut vec);
    let ptr = &vecref[0];
    vec.push(2);
    println!("{}", ptr);
}

On review of the RustBelt paper, this seems to be predicted, in that &'a mut T is not a subtype of &'a T (syntactically or semantically), but it makes me even more suspicious of the reverse principle.

The fact that I had to use the pair function above (an actual pair doesn't work without the indirection) suggests that there is something off in the borrow checker implementation here.

RustyYato commented 3 years ago

This seems to be more clear because &T: Copy. However &mut T: !Copy, so it isn't so clear that &&T -> &&mut T is unsound.

btw: you can implement dup like so without the pair function

fn dup<'a, T>(x: &'a mut T) -> (&'a mut T, &'a T) {
    let y: &&'a T = unmut(&x);
    let y = *y;
    (x, y)
}

digama0 commented 3 years ago

I worked it out on paper, not in Coq, but from my reading of the RustBelt logical rules, it is in fact derivable that [&T].shr(t,v) => [&mut T].shr(t,v), so in fact it is admissible for there to be a subtyping relation, not merely a function, from &'a &'b T to &'a &'b mut T.

chorman0773 commented 3 years ago

There are two questions here:

Is it UB, and
Is it sound. If the answer to the first question is yes, then it would not be sound, but it can be unsound without being UB.

For the first question, I'd hestitate to say no, but I wouldn't necessarily say yes. I assume it would come down to the propagation of aliasing invariants through references. #77 discusses whether or not validity invariants propagate through references. I would assume that if aliasing invariants propagate, then validity invariants propagate (but not necessarily the converse). If the current disposition is the one chosen, that would make the transmute not inherently UB based on the assumption. As for the soundness question, a mutable reference behind a shared reference can be used as a shared reference only, so I would presume that soundness would follow from having defined behaviour.

digama0 commented 3 years ago

My previous post argues that this operation upholds the rust safety invariants, meaning that as long as the &'a &'b T was obtained through safe code (or sound unsafe code), then it can be transmuted to a &'a &'b mut T and the safety invariants continue to hold. None of the discussions about validity invariants affect this (although if the RustBelt model needs to be modified then that might). In short, this function is sound. In particular, it's not UB if you give it safe input.

Validity invariants would come into play if unsafe code wants to pass an invalid-but-not-immediate-UB &'a &'b T to the function and hope that the result is sensible. Sound functions in general may or may not preserve validity invariants. In this case, I think it would be fairly difficult to define the validity invariants in a way that would make this fail - if validity is "shallow" then it's fine since everything interesting is happening behind a couple pointers, and even if it is deep the aliasing restrictions are the same in both cases, exactly because the types basically have identical semantics except for the inability to copy out of the target type.

chorman0773 commented 3 years ago

Validity invariants would come into play if unsafe code wants to pass an invalid-but-not-immediate-UB &'a &'b T to the function and hope that the result is sensible

The thing I brought up is if aliasing invariants propagate, in which case, when you transmute to &'a &'b mut T, it would "leak" the &'b mut T, in which case it could be a similar problem to transmuting &'b T to &'b mut T, which is always immediate UB.

Sound functions in general may or may not preserve validity invariants

Sound functions have to preserve validity invariants, as violating them is immediate UB (at least when you perform a typed copy, #84). If validity is transitive through a reference, then it wouldn't be far off to consider that aliasing requirements are as well.

Safety invariants are the things that do not necessarily have to be perserved, but it's not sound to expose a type with broken safety invariants to arbitrary safe code. It's not UB to do so, but it's not sound.

digama0 commented 3 years ago

A sound function satisfies safe(input) -> safe(output). Safety implies validity, safe(x) -> valid(x), meaning that if the input is safe then the output is valid (and also not UB), but if the input is valid then the output may not be valid (and UB might happen). I'm not sure what's a good example of something that is valid but not safe, let's say a string which is not valid UTF-8. Here's a function that is safe (sound) but does not preserve validity:

fn foo(s: String) {
    if s.as_bytes() == [0xFF] {
        unsafe { std::hint::unreachable_unchecked() }
    }
}

As long as you pass this function a string that was "legally obtained", i.e. from safe code or otherwise satisfying the safety invariants, nothing bad happens. But if you construct an invalid string, although that's not immediate UB, if you call this function you can cause UB.

If validity is transitive through a reference, then it wouldn't be far off to consider that aliasing requirements are as well.

This is interesting. I guess the invariant that might be violated here is "this pointer points to a value that is a pointer that is the only pointer to T". But that doesn't really make sense, because lots of other pointers exist; it's really a statement about which pointers are valid to write through, and the pointer that x points to isn't valid to write through, because it has an active shared borrow on it. And when borrow 'a ends, you still can't use the &'b mut T pointer because that pointer doesn't exist, it's actually a &'b T pointer and so you still can't write through it. In principle Stacked Borrows should have something to say about this, because it's dealing with the dynamics of unsafe code, but there isn't really anything going on here in terms of operations - all the invariants only get asserted when you read or write in SB.

steffahn commented 3 years ago

@digama0 What’s your take on my point of

the compiler might assume that for x: &&mut T and y: &&mut T holds: if (x as *const _ != y as *const _), then (*x as *const T != *y as *const T).

Or to give some example code:

fn foo<T>(x: &&mut T, y: &&mut T) {
    assert!(std::mem::size_of::<T>() > 0);
    if (x as *const &mut T != y as *const &mut T) && (*x as *const T == *y as *const T) {
        // Could the compiler assume that this point is unreachable?
    }
}

the intuition being:

The compiler might assume, that the code section indicated above is unreachable because *x is the only mutable reference to **x at this point. Other immutable references to **x may of course exist, since e.g. x can e.g. be coerced into &T. Since *y is a different mutable reference (x as *const &mut T != y as *const &mut T), it shouldn’t be able to point to the same T.

Some further thoughts

Even if the compiler doesn’t currently and will never assume such a thing, there’s still the possibility that unsafe code in the standard library or some other library (somehow) relies on the fact that (x as *const &mut T != y as *const &mut T) && (*x as *const T == *y as *const T) is always false.

What I mean is that

pub fn library_assumptions<T>(x: &&mut T, y: &&mut T) {
    assert!(std::mem::size_of::<T>() > 0);
    if (x as *const &mut T != y as *const &mut T) && (*x as *const T == *y as *const T) {
        unsafe {std::hint::unreachable_unchecked()}
    }
}

and

pub fn some_casting_function<'a, 'b, T>(x: &'a &'b T) -> &'a &'b mut T {
    unsafe {std::mem::transmute(x)}
}

could both be “sound” on their own, but together they allow

fn main() {
    let x = 0_u8;
    let a = &x;
    let b = &x;
    let a_ = some_casting_function(&a);
    let b_ = some_casting_function(&b);
    library_assumptions(a_, b_);
}

timeout: the monitored command dumped core
/playground/tools/entrypoint.sh: line 11:     7 Illegal instruction     timeout --signal=KILL ${timeout} "$@"

(playground)

In other terms, the question whether transmuting &'a &'b T to &'a &'b mut T is sound might be independent of the “axioms” that are the Rust semantics. As long as the standard library (logically containing the &mut type) doesn’t decide which one of library_assumptions and some_casting_function is considered sound and as long as both really don’t violate any inherent safety contracts of the Rust language, both should probably still be considered unsound.

chorman0773 commented 3 years ago

In principle Stacked Borrows should have something to say about this, because it's dealing with the dynamics of unsafe code, but there isn't really anything going on here in terms of operations - all the invariants only get asserted when you read or write in SB.

From my understanding, the "aliasing invariant" for &mut is that it is a pointer that has a valid Unique item on the borrow stack. By transmuting a &T to a &mut T, you are creating a &mut T with a SharedReadOnly item (which either is immediate UB, or when the &mut T is immediately reborrowed, causes UB because of the SharedReadOnly). If aliasing invariants are transitive, the broken invariant would be the fact you are referencing a &mut T that only has SharedReadOnly access to it's referent. It's possible that your interpretation is also correct, in which case whether or not aliasing invariant are transitive would be moot, and the function would not be UB.

digama0 commented 3 years ago

@digama0 What’s your take on my point of

the compiler might assume that for x: &&mut T and y: &&mut T holds: if (x as *const _ != y as *const _), then (*x as *const T != *y as *const T).

@steffahn I can see why this principle is intuitive, and can be reasonably interpreted as either true or false, where if it's true then the original function is unsound. Using the RustBelt rules as a source of truth for "semantic soundness", so that we can actually evaluate this claim, this principle comes up false, and the reason is because the sharing predicate of &mut T doesn't say anything resembling unique ownership, it's basically the same things as the sharing predicate for &T.

This is a little subtle, but it ties in to one of the core ideas of the paper: every type gets to define what it means to take a shared reference to it. The main reason for this behavior is for internal mutability; &Cell<T> can't just be a shared read-only reference to Cell<T> because that would preclude writing to the cell. The way &mut T defines sharing is as if it were first downgraded to a shared reference, as if you reborrowed it, before forming a pointer to the result. That is, if x: &mut T then the type of &x is essentially just &&T.

So even though it syntactically has a &mut in the type, there is no longer any hint of that underlying &mut T - it's not true that *x: &mut T because that would be asserting unique ownership, and rust will complain about moving a non-copy if you do that. All the operations conspire to make that inner pointer not distinguishable in any way from *x: &T, which is why the semantics are able to say that it is pointing to a &T.

But the RustBelt semantics aren't the only possible semantics you could give that validate the rules of the type system. A stricter semantics could validate your principle, likely by making the shared predicate for &mut T have more of a hint that there was once a unique reference here (although the exact details of such a design elude me), because with the strictest definition - the only safe things are the ones in the safe type system - I think the principle is true.

The tricky part with designing a semantics like this is making sure that it also upholds the safety guarantees of unsafe code that is encapsulated in a safe interface, like Cell or Mutex. If the semantics decide that most unsafe code is unsound then it's probably missed the mark. It's for this reason that I would rather stick to a well-tested semantics like RustBelt which has already been vetted against much of the standard library rather than make a tweak somewhere in the middle and hope I don't break anything.

In other terms, the question whether transmuting &'a &'b T to &'a &'b mut T is sound might be independent of the “axioms” that are the Rust semantics.

I think I've already said this above in other words, but in the type system, the transmute is not allowed (i.e. it's not literally safe, you have to use unsafe to write it), and with RustBelt it is allowed, but there is a whole spectrum of possible semantics given the constraints "safe code is safe" and "popular unsafe encapsulated libraries are safe" (and the latter is underspecified and also assumes that unsafe code authors are perfect which might be too hard line). So it is independent if you take those as your axioms (but those axioms might also be inconsistent if there is a bad library out there).

@chorman0773 I think the upshot of all this for stacked borrows is that even if "the aliasing invariant is transitive", you aren't actually getting the aliasing invariant of &mut T, you are getting the aliasing invariant of &T, when you traverse the type structure, because putting a shared reference does a kind of rewrite on the whole type turning all the &mut to &; that is you are really looking at a type like&(&mut T).shr = &&T.shr where T.shr is the part of the rewrite that depends on what T is.

More concretely, the aliasing invariant for x: &&mut T says that x has a SharedReadOnly (or Unique) tag on the stack, and *x also has a SharedReadOnly (or Unique) tag on the stack, and **x satisfies the aliasing invariant for a shared T. In other words, the invariant is indistinguishable from the invariant for &&T. With this scheme, the transmute is trivial because the before and after invariants are the same. You aren't "transmuting a &T to a &mut T", that would definitely be UB. The outer & makes all the difference.

Diggsey commented 3 years ago

Regarding @digama0's example, it seems to me like the unsafety comes from extending the lifetime of the reference to 'b.

ie. instead of:

fn unmut<'a, 'b, T>(x: &'a &'b mut T) -> &'a &'b T

The following would be safe:

fn unmut<'a, 'b, T>(x: &'a &'b mut T) -> &'a &'a T

This is intuitive from the fact that once the shared borrow of the mutable reference expires, it becomes mutable again, and so it would be UB for there to be further accesses to T through a shared reference, even if the T itself is still alive.

This is also in-line with:

The way &mut T defines sharing is as if it were first downgraded to a shared reference, as if you reborrowed it, before forming a pointer to the result.

During the "reborrow" step, the lifetime will get shortened to that of the outer reference.

digama0 commented 3 years ago

This is a good point. Indeed, I've been hedging about the types of &&mut T and &&T being the same in RustBelt, and that's because the lifetimes are different; in &'a &'b mut T the sharing predicate [&'b mut T].shr('a, t, v) has to assert that the inner reference is alive only at the intersection of 'a and 'b; of course in a normal rust program the fact that 'b outlives 'a will be floating around and so this is just the same as lifetime 'a. In other words, &'a &'b mut T is indeed just &'a &'a T.

RalfJung commented 3 years ago

Is transmuting &’a &’b T to &’a &’b mut T sound or UB?

FWIW, if we ignore Stacked Borrows, we even have a proof of soundness for this in RustBelt. (So @digama0's analysis at the beginning of this thread is right.) But of course this assumes the RustBelt semantic model. That said, there is already code out there which assumes this to be correct (code like this, dubbed "Abomonation"), and the RustBelt model was designed the way it is specifically to support that kind of code.

I had not thought about the ptr-equality invariant @steffahn proposed before, and it is very interesting. However, the fact that it has to use ptr equality also makes it suspicious. Aliasing and pointer equality are only loosely related, as is already apparent from the fact that you have to exclude ZST. (Another piece of evidence for this is that LLVM noalias/C restrict can not be used to optimize pointer equalities, it can only be used to reorder memory accesses.) Furthermore, it is hard to imagine this invariant being useful. So if we have to choose between Abomonation and this, I have a strong preference for Abomonation.

The other interesting question is: what about the aliasing model? Currently, RustBelt does not incorporate Stacked Borrows, so there is no easy answer for this (unless someone finds a way to make Miri raise an aliasing error when using this transmute). But at some point we'll hopefully define a semantic model of Rust that accounts for the aliasing model, and then again I would view it as a design constraint that this transmute should be allowed, and I would choose the model accordingly. In particular, the "shared invariant" of a mutable reference should permit the corresponding borrow stack item to be a Shared*.

steffahn commented 3 years ago

So if transmuting &'a &'b mut T ⟷ &'a &'a T is sound, the same applies to &'a Box<T> ⟷ &'a &'a T

and more complex things like &'a &'a [Option<&'a T>; N] ⟷ &'a Box<[Option<&'b mut [T; 1]>; N]>, right?

RustyYato commented 3 years ago

So if transmuting &'a &'b mut T ⟷ &'a &'a T is sound, the same applies to &'a Box<T> ⟷ &'a &'a T

~~Both of these are only sound in one direction. From shared to exclusive/box.~~

Edit: I need to not misread things

steffahn commented 3 years ago

Both of these are only sound in one direction. From shared to exclusive/box.

@RustyYato, I thought we agreed on

instead of:
fn unmut<'a, 'b, T>(x: &'a &'b mut T) -> &'a &'b T
The following would be safe:
fn unmut<'a, 'b, T>(x: &'a &'b mut T) -> &'a &'a T
https://github.com/rust-lang/unsafe-code-guidelines/issues/270#issuecomment-761278860

chorman0773 commented 2 years ago

The question of soundness can be reduced to a simple question, actually. Is it defined behaviour to reborrow a &T out of the &&mut T. That is, can the program retag as SharedReadOnly, despite the tag behind the first reference not being the Unique required by &mut T, but merely being SharedReadOnly.

My initial thought is yes, and according to miri currently, it is yes. The question of whether miri is correct, or it's a false negative, should be the remaining question: The follow code passes miri today w/o -Zmiri-tag-raw-pointers (though I don't see a reason that -Zmiri-tag-raw-pointers would affect the result, as no raw pointers are used):

let x = 0;
let y = &x;
let p = unsafe{core::mem::transmute::<&&i32,&&mut i32>(&y)};
println!("{}",&**p);

JakobDegen commented 2 years ago

With respect to the above comment, rust-lang/rust#96116 now gives us the option to easily make this UB if we want it to be. (Had originally posted there, but it was sort of off-topic there anyway)

thomcc commented 2 years ago

I'd really like to be able to go from &[&str] <-> &[Box<str>]. This allows Box<str> to avoid the problems that the &str/String dichotemy have, e.g. "you should take &str everywhere" kinda fails in cases where you need a &[String] -- Not so if your owned string type is Box<str>.

This is obviously slightly different, but if there's no strong reason to make it UB, it would be nice to keep well defined. (I don't do this anywhere now, mostly because I believe this to be on-the-fence WRT being well-defined).

(Strictly speaking, I suppose only the Box -> str version needs to work, though)

RalfJung commented 2 years ago

I worked hard in RustBelt to make sure that we could prove soundness of those transmutes. It would be a shame if the aliasing model destroys that, since I don't think we can gain anything from it...

Why does that pass 'help' make this UB? I don't quite understand what is meant by 'derefer'.

RalfJung commented 2 years ago

I guess the &**p becomes two statements? However, I think it would be a bug to have *p and treat it like a regular ol' mutable reference and retag it -- we don't actually exclusively own that reference. If anything we might want to treat it as a shared reference (all references below shared references are implicitly shared).

JakobDegen commented 2 years ago

Yeah, we essentially turn

let _r = &**p;

into

let temp = *p;
let _r = &*p;

and then the retag introduced on temp will cause UB. I have not actually tested this though (I may be able to do that tomorrow, not sure). If this is indeed something we don't want, this seems worth discussing now, to make sure that any other planned changes are compatible with this

RalfJung commented 2 years ago

Makes sense. Indeed I think we do not want this to be UB. It'd be pretty bad if that pass introduced UB into a UB-free program...

Conceptually, giving temp the type &mut T is also not quite right -- we don't actually own it at that type there. The type &T would be more accurate.

chorman0773 commented 2 years ago

Yeah, we essentially turn

let _r = &**p; into

let temp = p; let _r = &p; and then the retag introduced on temp will cause UB. I have not actually tested this though (I may be able to do that tomorrow, not sure). If this is indeed something we don't want, this seems worth discussing now, to make sure that any other planned changes are compatible with this

I wonder if that could add UB to other things, by retagging unique.

Edit: Took me all of two seconds to come up with a trivial example with safe code: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=c260cde29b2b85dd64d3fae5c6901e08.

If the copy out of p into temp is of type &mut T, this would add a retag as unique when initializing s which invalidates r. However, r is not statically invalidated, so nothing could be introduced that dynamically invalidates it.

bjorn3 commented 2 years ago

That extra temporary variable could be very large. What is the exact reason you are introducing the derefer pass?

digama0 commented 2 years ago

The temporary there should have type &T (or &mut T with some lying about uniqueness), so it should not be large.

I don't see a reason for the pass however, since &**p is already only one "actual" operation - *p reads p: &&T to produce a &T place equivalent to p, and **p dereferences the pointer to get a T place, and &**p converts the place back to a pointer of type &T. The first and last operations don't actually do anything other than convert between place and value representations; only the middle dereference is actually doing a pointer load (and the first operation is accessing p).

JakobDegen commented 2 years ago

I believe the intent of this pass is to require that a Deref projection be the first one if it is present at all. This is actually really useful, because it makes things easier on codegen since that's closer to LLVM GEP, and it's also much easier to reason about. Right now there's weird stuff where the LHS of an assignment can actually read from arbitrary memory by doing two derefs

Jules-Bertholet commented 4 months ago

A related question: is the following function sound?

use std::{marker::PhantomData, mem::MaybeUninit, process::abort, ptr};

pub struct NoSendSync<T> {
    pub value: T,
    pub phantom: PhantomData<*const ()>,
}

pub fn strange<T>(param: &&mut NoSendSync<T>) -> ! {
    unsafe {
        let inner_mut: &mut NoSendSync<T> = ptr::read(param);
        let inner_mut: *mut NoSendSync<T> = inner_mut;
        let inner_mut: *mut MaybeUninit<NoSendSync<T>> = inner_mut.cast();
        *inner_mut = MaybeUninit::zeroed();
        abort();
    }
}

RalfJung commented 4 months ago

How's that related? This seems to be asking the same question as https://github.com/rust-lang/unsafe-code-guidelines/issues/84, I don't see much of a relation to the current issue.

RalfJung commented 4 months ago

Oh maybe I see what you mean.

In terms of its safety invariant, & makes things read-only and applies fully recursively (until it hits an UnsafeCell). So no you can't just write to an &&mut NoSendSync<T> as you don't have the permissions to write the inner reference.

rust-lang / unsafe-code-guidelines

Is transmuting `&’a &’b T` to `&’a &’b mut T` sound or UB? #270

Some further thoughts