rust-lang / rust-memory-model

Collecting examples and information to help design a memory model for Rust.
Apache License 2.0

Possible utility of 'unsafe lifetime in memory model #37

Open jpernst opened 7 years ago

jpernst commented 7 years ago

After consensus was reached that my 'unsafe lifetime RFC could have unanticipated interactions with an eventual memory model, I got to thinking about what those interactions might be, and whether they might even be beneficial in some way. I'm far from a domain expert on this kind of thing, but I feel like there's potential here that is at least worth exploring.

One of the problems cited with the Tootsie-Pop model is that within the unsafe boundary we're forced to assume all references are potentially in an invalid, aliasing state, and to prevent any related optimizations for the duration. While this will cover a lot of common uses of unsafe, it might be overly strong wrt preventing optimization in general. There are also difficulties in defining exactly where that boundary ends, i.e. the point at which invariants are fully restored.

What I'm wondering now is if perhaps 'unsafe might play a role in defining that boundary. First, we start by assuming that all references are valid and correct for the duration of their stated lifetime, even in unsafe code. As pointed out in the Tootsie-Pop model, this can cause problems though:

impl [T] {
    pub fn split_at_mut(&mut self, mid: usize) -> (&mut [T], &mut [T]) {
        let copy: &mut [T] = unsafe { &mut *(self as *mut _) }; 
        let left = &mut self[0..mid];
        let right = &mut copy[mid..];
        (left, right)
    }
}

Within the body of this function, the invariants of &mut are violated by the duplication of self. Let's make a slight change though:

impl [T] {
    pub fn split_at_mut(&mut self, mid: usize) -> (&mut [T], &mut [T]) {
        let copy: &'unsafe mut [T] = unsafe { &mut *(self as *mut _) }; 
        let left = &mut self[0..mid];
        let right = &mut copy[mid..];
        (left, right)
    }
}

We add the 'unsafe lifetime to the duplicate reference. This has the effect of marking that reference as possibly violating invariants, and it's at this point that we're forced to forgo alias-based optimizations. In other words, we can think of &'unsafe T as a sort of "Medusa" reference, in that if we statically have line-of-sight to such a ref, the optimizer becomes "petrified". Once we've adjusted the slices and returned them, the Medusa ref goes out of scope, and the optimizer is unpetrified.
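
For comparison, today's Rust has no 'unsafe lifetime, so the snippet above does not compile as written. One way to write the same function today is to go through raw pointers, so that the two returned slices are built without ever holding a second `&mut` to the whole slice. The following is only a minimal sketch under that assumption; the free function name `split_at_mut_sketch` is hypothetical and this is not the standard library's source.

```rust
use std::slice;

/// Splits `s` into two non-overlapping mutable halves at `mid`.
/// Hypothetical free-function sketch; panics if `mid > s.len()`.
fn split_at_mut_sketch<T>(s: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
    let len = s.len();
    // Without this check, the pointer arithmetic below could go out of bounds.
    assert!(mid <= len, "mid out of bounds");

    // A raw pointer carries no aliasing guarantees of its own, so no
    // duplicate `&mut` to the whole slice is created here.
    let ptr = s.as_mut_ptr();

    // SAFETY: `[0, mid)` and `[mid, len)` are disjoint and both lie inside
    // the original slice, which stays exclusively borrowed for the lifetime
    // of the two returned references.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
```

In this version the raw pointer `ptr` loosely plays the role the proposal gives to the &'unsafe reference: it is the one value in scope that does not carry the usual `&mut` guarantees, and it goes out of scope once the two slices are returned.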

If a module contains a struct that has a private field of 'unsafe lifetime, then any function in that module that can access an instance of that struct has line-of-sight to that field and hence is petrified. Further, PhantomData<&'unsafe ()> could be used to force such behavior if necessary. However, code that contains no such references, and is merely calling some FFI function, would not be petrified even though it is unsafe.

This idea still isn't fully formed, but I felt like it had some potential for exploration, so I'm posing it to the experts for possible discussion. There are open questions: should borrowing a pointer deref infer the 'unsafe lifetime? Would doing so be backward compatible? And what impact might that have on existing unsafe code that makes use of pointer borrows?

RalfJung commented 7 years ago

I think currently the general consensus is that code like `split_at_mut` is far too common to be forbidden. People write such code and expect it to work; if we violate that expectation, the situation will be much like the trouble C has with undefined behavior (where the expectations of programmers and compiler writers widely diverge).