rust-lang / rust


Arc::drop has a (potentially) dangling shared ref #55005

Closed RalfJung closed 2 years ago

RalfJung commented 5 years ago

Discovered by @Amanieu on IRLO. Quoting their report:

Arc::drop contains this code:

if self.inner().strong.fetch_sub(1, Release) != 1 {
    return;
}

Once the current thread (Thread A) has decremented the reference count, Thread B could come in and free the ArcInner.

The problem becomes apparent when you look at the implementation of fetch_sub:

pub fn fetch_sub(&self, val: $int_type, order: Ordering) -> $int_type {
    unsafe { atomic_sub(self.v.get(), val, order) }
    // HERE
}

Note the point marked HERE: at this point we have released our claim to the Arc (as in, decremented the count), which means that Thread B might have freed the ArcInner. However, the &self still points to the strong reference count in the ArcInner -- so &self dangles.

Other instances of this:

RalfJung commented 5 years ago

Potential fixes:

RalfJung commented 5 years ago

Turns out Clang has a similar issue: it uses dereferenceable for C++ references, and that's not correct because the pointee might get deallocated during the course of the function. Hence another possible fix might be to use dereferenceable_on_entry instead of dereferenceable for &UnsafeCell<T> once that attribute exists.

jClaireCodesStuff commented 5 years ago

It's a little spooky to see a situation where a region of code could do something undefined without an unsafe block. However, as the Nomicon notes, this is to be expected when working around unsafe code. A soundly designed library will encapsulate this potential unsafety, but it's not possible or even desirable to protect closely neighboring safe code from the potential of UB.

Anyway, in this twilight zone dangling pointers are not prohibited. Dereferencing dangling pointers is. Exposing somebody else's unsuspecting code to a dangling pointer is also very forbidden. But it's fine for dangling pointers to exist as long as they are dead.

Remember: "dead" means "will not be accessed by anything below this point." Other languages don't even have a concept of reference lifetimes.

Once the current thread (Thread A) has decremented the reference count, Thread B could come in and free the ArcInner.

That can happen. But if it does, Thread A is guaranteed to fetch a reference count greater than 1, which then causes Arc::drop to return early. self dangles but it isn't dereferenced while dangling, so no problem.

It would be undefined behavior for any method to be called after drop. But we also know that this won't happen, because "drop is the last method call" is one of the invariants.

Honestly, I think you're a little too attached to the "Stacked Borrows" model - this is a situation where it doesn't work and rather than trying to understand why the model should be changed, you're arguing that the source code of stdlib should be declared unsound. If Rust continues in the tradition of LLVM, the uniqueness of &mut really means something like "I solemnly swear that the addresses derived from this reference will not be derived by any other means until after the last use of this reference."

hanna-kruppe commented 5 years ago

The immediate problem is not about Rust, safe or unsafe, or any proposed formal semantics for it. rustc emits IR that has technically undefined behavior, because there is a pointer that is claimed to be dereferenceable for the duration of a function but points to memory that is (potentially) deallocated during the execution of the function. That this is not a Rust-specific problem can also be seen by the fact that Clang ('s C++ mode) is affected by it too, as linked earlier. This specific instance is not terribly likely to cause miscompilations, but others are, and in any case "UB that's unlikely to cause problems in practice" is not a good way to build a reliable language implementation.

Aside from that, there is the task of justifying the LLVM IR that rustc emits with Rust-level rules -- at minimum, in every case where emitted LLVM IR has UB, the input Rust code also needs to have UB to be able to claim rustc is correct. This issue points out a case where previous attempts at justifying the dereferenceable attribute that rustc emits are not sufficient. There are multiple possible remedies for that, none of which are about stacked borrows and most of which modify how rustc emits IR (emitting the attribute in fewer cases, or emitting a weaker attribute instead in some cases) to make the existing code well-defined without modification.

vertexclique commented 5 years ago

Option one for resolving this bug:

  • Provide a version of fetch_sub that takes a raw pointer, and use that.

This was tried in https://github.com/rust-lang/rust/pull/58611

nikomatsakis commented 4 years ago

I'm nominating this issue -- and tagging it for @rust-lang/lang -- in response to some comments by @jamesmunns in today's Project Leadership Sync meeting. In particular, @jamesmunns pointed out that this issue around the dereferenceable attribute has potentially large repercussions in the embedded space, and cited @japaric's RFC as well as a number of issues around the embedded space (see the minutes).

I've got to dig a bit deeper here but it seems like it'd be good to check-in on the current thinking here. I will try to catch up on the RFC and some of the proposed changes and see if I can post any notes.

Without having done so, I would say that, as far as I know, we still want to keep the dereferenceable attribute, so I expect some form of these changes will be required.

(I suppose something like the dereferenceable-on-entry proposal (https://lists.llvm.org/pipermail/llvm-dev/2018-July/124555.html) might also be good; does anyone know if that's made progress? I'm going to assume not.)

vertexclique commented 4 years ago

@nikomatsakis I have done some work on this, but it extends the visible API surface by one method; it resolves the problem at that cost.

64 commented 4 years ago

I could be wrong, but it seems like LLVM now supports nofree and nosync function attributes, which would solve the issue. https://lists.llvm.org/pipermail/llvm-dev/2018-July/124555.html

hanna-kruppe commented 4 years ago

nofree and nosync exist now, but as far as I can tell the switch from

dereferenceable applies through the entire function

to

dereferenceable applies on function entry (use nofree and nosync information to recover analysis power)

has not yet happened.

RalfJung commented 4 years ago

Ah, so the tentative plan in LLVM now is to weaken dereferenceable semantics instead of adding another attribute?

That seems bad though. Quoting:

I'd like to propose adding a nofree function attribute to indicate that a function does not, directly or indirectly, call a memory-deallocation function (e.g., free, C++'s operator delete).

So, e.g., a function that takes a reference and also drops an unrelated Box could not express, with its attributes, that the reference will not get deallocated.
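As an illustrative example of that limitation (a hypothetical function, not from the proposal): the following frees memory when it drops the Box, so it could not be marked nofree, even though the reference it takes is never deallocated during the call.

fn takes_ref_and_drops_box(r: &i32, b: Box<i32>) -> i32 {
    drop(b); // deallocates the Box, so this function cannot be `nofree`
    *r       // yet the memory behind `r` stays allocated for the whole call
}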

Maybe someone from the Rust side should chime in and raise this as a concern?

hanna-kruppe commented 4 years ago

I forgot about this before, but there's a (not yet committed) patch introducing a new attribute for "dereferenceable forever" at https://reviews.llvm.org/D61652

RalfJung commented 4 years ago

See https://github.com/rust-lang/rust/issues/65796 for another instance: there's a call to AtomicBool::store there signalling another thread that it can remove this memory; that deallocation might happen while AtomicBool::store still runs with its &self argument.

RalfJung commented 4 years ago

I forgot about this before, but there's a (not yet committed) patch introducing a new attribute for "dereferenceable forever" at https://reviews.llvm.org/D61652

Uh-oh, I just read over that, and if I didn't miss anything, that new attribute is useless for us. It means "this pointer value is dereferenceable for the remaining execution of the program", which is not the case for references.

pnkfelix commented 4 years ago

@RalfJung my reading of the dialogue on https://reviews.llvm.org/D61652 is that LLVM still plans to eventually add no-free/no-sync that will (hopefully?) allow us to fix the normal dereferenceable attribute, and semi-orthogonally to that they are also adding the dereferenceable_globally attribute.

That is, the dereferenceable_globally attribute is not replacing dereferenceable (perhaps the latter will even be alpha-renamed to dereferenceable_locally?), so we should not need to raise a stink about it regressing performance for us, no?

RalfJung commented 4 years ago

We can still emit dereferenceable with its new semantics, true. But it is much weaker than what we do currently, and there is no replacement. We can't use no-free/no-sync, as such global statements ("this function will not deallocate anything") are not things our type system provides to us.

nikomatsakis commented 4 years ago

On a related note, @cuviper recently tracked down some seg-faults in rayon and it came down to this same basic pattern. In particular, as @cuviper explains,

I think we have the same issue here:

impl<'a, L: Latch> Latch for TickleLatch<'a, L> {
    #[inline]
    fn set(&self) {
        self.inner.set();
        self.sleep.tickle(usize::MAX);
    }
}

The problem is that as soon as we set(), the waiting thread may actually proceed and destroy the latch. It may have already been awake (so it didn't need the tickle) or just happened to race awake for other reasons. The point is, we're not really representing lifetimes well in that &self reference, in that it can actually die before the end of this function, and then reading self.sleep is a use after "free" (on the stack).

This is definitely further evidence that this sort of pattern is pretty common "in the wild" and the kind of thing that folks are likely to write naturally. Note that all these cases embed the latch directly, although it is quite possible that the callers to latch.set() are holding an Arc<Latch> or &Latch and might themselves get freed after the call to set() as well.

RalfJung commented 4 years ago

@nikomatsakis just to be clear, the UB here did not come from dereferencable, but from there being actually further accesses after deallocation (namely, self.sleep)?

cuviper commented 4 years ago

@RalfJung correct, the segfault was a direct result of reading self.sleep after set() cleared the way for dropping this latch. But even if we move that read earlier, there may still be UB in &self remaining at all, like this issue. I think maybe the latch method should take *const Self instead, but even then there's ultimately an atomic &self call in there that will invalidate itself.
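A rough sketch of what such a *const Self-style setter might look like (the Latch type and set_raw here are illustrative only, not rayon's actual API); note that the atomic store inside still goes through a reference, which is the remaining problem mentioned above:

use std::sync::atomic::{AtomicBool, Ordering};

struct Latch {
    set: AtomicBool,
}

impl Latch {
    // Hypothetical raw-pointer variant: the caller no longer passes `&self`, so
    // this function makes no "dereferenceable for the whole call" claim itself.
    //
    // SAFETY: `this` must point to a live `Latch` at the moment of the store.
    unsafe fn set_raw(this: *const Latch) {
        // The place expression below still creates a `&AtomicBool` for the
        // duration of `store`, which is exactly the residual problem this
        // thread is discussing.
        (*this).set.store(true, Ordering::Release);
    }
}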

comex commented 4 years ago

I think maybe the latch method should take *const Self instead, but even then there's ultimately an atomic &self call in there that will invalidate itself.

Interesting. We could add a *const Self-taking alternative for every method on atomics, but that would be a lot of duplication, and it would be a footgun (i.e. unsafe code authors would accidentally use the &self version when they need the *const Self version).

Instead, how about adding an attribute to opt out of dereferenceable – or from Stacked Borrows' perspective, 'protectors' – on specific function arguments? e.g.

fn fetch_add(#[unprotected] &self, val: i16, order: Ordering) -> i16 { ... }

It's not as if dereferenceable provides any optimization benefit on a trivial method that wraps a single memory-access intrinsic. LLVM knows the memory will be dereferenced when the method, well, explicitly dereferences it, and there are no operations beforehand: nowhere to put a speculative read that's separated from the actual read. I suppose, assuming the atomic method is inlined (which of course it should be), the compiler could hypothetically (very hypothetically) take advantage of dereferenceable to insert a speculative read immediately after the atomic operation, not for that operation's own sake but for the sake of some future access to the same memory location, later in the caller function. But the atomic instruction(s) would normally provide the value themselves without needing to add another load. Even if it somehow didn't, the compiler would usually be able to conclude that the object will stay alive based on the caller function's behavior.

Using an attribute would solve the duplication and the wrong-atomic-variant footgun. And the attribute could also be used by other code, like set() in the above example. On the other hand, it would be easy for unsafe code to forget to use the attribute.

CAD97 commented 4 years ago

#[unprotected]

That's #[may_dangle], right? Or at the very least, it's closely related. The problem is that &self may be dangling after the atomic operation but before the function returns (because the function itself is not atomic).


Are there any other primitives in the Rust standard library other than Atomic* that could be used for synchronization leading to a free? I worry that this might end up a recurring bug for synchronized types, and a frighteningly scary one.

At the very least, I'd like to see an argument for whether the same issue applies to the atomic op in Arc::drop_slow/Weak::drop:

// Arc::drop
    fn drop(&mut self) {
        if self.inner().strong.fetch_sub(1, Release) != 1 {
            return;
        }

        atomic::fence(Acquire);

        unsafe {
            self.drop_slow();
        }
    }

// Arc::drop_slow
    unsafe fn drop_slow(&mut self) {
        ptr::drop_in_place(&mut self.ptr.as_mut().data);

        if self.inner().weak.fetch_sub(1, Release) == 1 {
            atomic::fence(Acquire);
            Global.dealloc(self.ptr.cast(), Layout::for_value(self.ptr.as_ref()))
        }
    }

// Weak::drop
    fn drop(&mut self) {
        // If we find out that we were the last weak pointer, then its time to
        // deallocate the data entirely. See the discussion in Arc::drop() about
        // the memory orderings
        //
        // It's not necessary to check for the locked state here, because the
        // weak count can only be locked if there was precisely one weak ref,
        // meaning that drop could only subsequently run ON that remaining weak
        // ref, which can only happen after the lock is released.
        let inner: &ArcInner<T> = if let Some(inner) = self.inner() { inner } else { return };

        if inner.weak.fetch_sub(1, Release) == 1 {
            atomic::fence(Acquire);
            unsafe { Global.dealloc(self.ptr.cast(), Layout::for_value(self.ptr.as_ref())) }
        }
    }

I think both of these suffer from the same defect. Specifically, the danger zones are:

(All MIR generated on this playground, which copies the relevant implementation of Arc and Weak.)

MIR analysis (abridged; only the annotated lines of the dump are shown):

fn <Arc<T> as Drop>::drop(_1: &mut Arc<T>) -> () {
    // XXXXXX We care about _5, the reference to ArcInner
    let _5: &ArcInner<T>;
    ...

    bb0: {
        ...
        // XXXXXX _5 is marked live here
        StorageLive(_5);
        ...
        // XXXXXX _5 is created here
        _5 = const Arc::<T>::inner(move _6) -> bb1;
    }
    bb1: {
        _4 = &((*_5).0: std::sync::atomic::AtomicUsize);
        ...
        // XXXXXX fetch_sub is executed here
        _3 = const std::sync::atomic::AtomicUsize::fetch_sub(move _4, const 1usize, move _7) -> bb2;
    }
    bb2: {
        // XXXXXX _5 may be dangling here
        ...
        // XXXXXX _5 is marked dead here
        StorageDead(_5);
        ...
    }
    ...
}
MIR analysis (abridged; only the annotated lines of the dump are shown):

fn Arc::<T>::drop_slow(_1: &mut Arc<T>) -> () {
    // XXXXXX We care about _10 and _21, the references to ArcInner
    let _10: &ArcInner<T>;
    let _21: &ArcInner<T>;
    ...

    bb2: {
        ...
        // XXXXXX _10 is created here
        _10 = const Arc::<T>::inner(move _11) -> bb3;
    }
    bb3: {
        ...
        // XXXXXX fetch_sub is executed here
        _8 = const std::sync::atomic::AtomicUsize::fetch_sub(move _9, const 1usize, move _12) -> bb4;
    }
    bb4: {
        // XXXXXX _10 may be dangling here
        ...
        // XXXXXX _10 is marked dead here
        StorageDead(_10);
        ...
    }
    bb7: {
        ...
        // XXXXXX _21 is created here
        _21 = const std::ptr::NonNull::<ArcInner<T>>::as_ref(move _22) -> bb8;
    }
    bb9: {
        ...
        // XXXXXX ArcInner is dealloc'd here
        _0 = const ::dealloc(move _15, move _17, move _19) -> bb10;
    }
    bb10: {
        // XXXXXX _21 IS DEFINITELY DANGLING XXXXXX
        ...
        // XXXXXX _21 is marked dead here
        StorageDead(_21);
        ...
    }
    ...
}

NOTE: _21 is the Layout::for_value reference to ArcInner, and based solely on the StorageDead MIR instruction, it lives until after the dealloc, so it is a reference to freed memory. #70686

MIR analysis (abridged; only the annotated lines of the dump are shown):

fn <Weak<T> as Drop>::drop(_1: &mut Weak<T>) -> () {
    // XXXXXX We care about _2, the reference to ArcInner
    let _2: &ArcInner<T>;
    ...

    bb3: {
        ...
        // XXXXXX _2 is created here
        _2 = _6;
        ...
        // XXXXXX fetch_sub is executed here
        _8 = const std::sync::atomic::AtomicUsize::fetch_sub(move _9, const 1usize, move _10) -> bb5;
    }
    bb5: {
        // XXXXXX _2 may be dangling here
        ...
    }
    bb8: {
        ...
        // XXXXXX _19 is created here
        _19 = const std::ptr::NonNull::<ArcInner<T>>::as_ref(move _20) -> bb9;
    }
    bb10: {
        ...
        // XXXXXX ArcInner is dealloc'd here
        _0 = const ::dealloc(move _13, move _15, move _17) -> bb11;
    }
    bb11: {
        // XXXXXX _19 IS DEFINITELY DANGLING XXXXXX
        // XXXXXX _2 IS DEFINITELY DANGLING XXXXXX
        ...
        // XXXXXX _19 is marked dead here
        StorageDead(_19);
        ...
    }
    bb12: {
        // XXXXXX _2 is marked dead here
        StorageDead(_2);
        ...
    }
}

NOTE: Weak::drop will always have a dangling reference when dropping the ArcInner!


Semantically, it seems like (barring a #[may_dangle]-like solution) all of Arc::drop, Arc::drop_slow, and Weak::drop need to have some extra synchronization signal that they're holding a &ArcInner and not to deallocate it yet. The problem, of course, is that said synchronization signal is inside the ArcInner.

comex commented 4 years ago

#[unprotected]

That's #[may_dangle], right? Or at the very least, it's closely related.

I hadn't thought of that. In a way, the two are opposites. #[may_dangle] (an attribute on type or lifetime parameters of Drop impl blocks) is a way for the programmer to tell the compiler: "I promise my code won't assume this lifetime is valid, so you can allow it to become dangling". My proposed #[unprotected], on the other hand, would tell the compiler: "You must promise not to assume this reference is valid, because my code might cause it to become dangling".

  • inside Arc::drop_slow, temporary _: &ArcInner

So, if I understand Stacked Borrows correctly, it actually allows references in general to become dangling before they go out of scope. Only references that happen to be function arguments are special: they're given a "protector", which makes it insta-UB to invalidate the reference for the duration of the function execution. That makes the situation somewhat more tractable.

RalfJung commented 4 years ago

So, if I understand Stacked Borrows correctly, it actually allows references in general to become dangling before they go out of scope. Only references that happen to be function arguments are special: they're given a "protector", which makes it insta-UB to invalidate the reference for the duration of the function execution. That makes the situation somewhat more tractable.

Exactly. References generally only have to be "valid when used" -- though "used" is interpreted very coarsely and includes reborrows and (typed) copies, not just dereferences. The only exception is references passed as function arguments.
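As an illustrative sketch of that distinction (assuming the Stacked Borrows rules as just described; this is not an authoritative statement of the model):

fn local_reference_may_dangle(b: Box<i32>) {
    let r: &i32 = &*b;
    let _x = *r; // last use of `r`
    drop(b);     // deallocates the pointee; fine, `r` is never used again
}                // `r` dangled at the end, but only while "dead"

fn argument_is_protected(r: &i32, maybe_frees_it: impl FnOnce()) {
    let _x = *r;
    maybe_frees_it(); // if this call (via unsafe code elsewhere) deallocated
                      // `r`'s pointee, that would be UB: `r` is a function
                      // argument and thus protected for this whole call.
}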

#[unprotected]

Interesting, I would have made this a property of the type, not the individual functions. While it is true that the semantics for this work at the granularity of functions, I feel like annotating types is more tractable.

Centril commented 4 years ago

@RalfJung Could you elaborate on the tractability of annotations on types vs. functions? (Presumably by "on types" you mean where the type is defined, not where it is used.)

RalfJung commented 4 years ago

Well I just meant that you define the type just once and so won't forget to annotate every single function that needs this. We could even make UnsafeCell have this effect to provide some backwards compatibility.

However, in the lang team meeting about this we also identified that this is not sufficient to solve dereferenceable issues for other cases that do not involve interior mutability, such as Drain.

nikomatsakis commented 4 years ago

I was debating about proposing the "opt-out" that @comex describes, but I'm not sure how I feel about it yet. It would definitely be easy to forget; I think we'd need some lints and extra checks to help in propagating those annotations.

You can certainly start to see the appeal of having a distinct type that is basically a "reference that may dangle on exit" -- but then, in some sense we have that type, it's called a raw pointer (ok, it's a bit more extreme), and the whole problem is that we have stable APIs that don't use raw pointers.

But also, there is one more point that I think is important: it's not great to have the only two options be "full safety, with all that implies" and "fully raw, with all that implies". It's really useful in practice to have intermediate levels of safety, even if they're more like "lints" that guide you into correct code than hard guarantees. I think this is why it feels a bit wrong to me to say that we should have raw pointers all throughout these APIs.

More and more, I do feel like we are going to want to special case UnsafeCell in terms of saying "a reference to an UnsafeCell may dangle on exit", even if it's not a full solution, because it helps to address both backwards compatibility and what is clearly a pretty common case.

nikomatsakis commented 4 years ago

I was thinking about this some more. The rayon bug that we encountered was a typical data-race sort of bug -- it took us quite some time to track down the crash, and the code "appears" quite innocuous. This makes me wonder. If we had had the stricter rules, in which &self cannot be invalidated in the middle of the function's lifetime, and had had some way to dynamically test those rules, we would have gotten an error much earlier. This would then have required us to alter the type to *const Self -- or perhaps even some other newtype'd wrapper like LiveOnEntry<Self> -- which might then have raised awareness of the underlying problem. Food for thought.
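As a purely illustrative sketch of that last idea (LiveOnEntry is just the name floated above; nothing like it exists in std or rayon), such a wrapper might look roughly like this:

use std::marker::PhantomData;

/// A reference-like wrapper that only promises validity at the moment it is
/// created and passed, not for the whole duration of the callee (hypothetical).
pub struct LiveOnEntry<'a, T> {
    ptr: *const T,
    _lifetime: PhantomData<&'a T>,
}

impl<'a, T> LiveOnEntry<'a, T> {
    pub fn new(r: &'a T) -> Self {
        LiveOnEntry { ptr: r as *const T, _lifetime: PhantomData }
    }

    /// SAFETY: the caller must know the pointee has not been deallocated yet,
    /// e.g. because the step that releases the memory has not happened.
    pub unsafe fn get(&self) -> &T {
        &*self.ptr
    }
}

Since the wrapper is just a raw pointer inside, passing it as an argument would not emit dereferenceable (and would not get a protector); the cost is that every read goes through an unsafe call.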

nikomatsakis commented 4 years ago

So it seems like the most actionable thing we can do at this juncture is to create functions on the Atomic operations that take raw pointers, and to encourage folks to use this if performing the atomic operation may have the side-effect of freeing the memory in which the atomic itself resides.

This implies, in turn, that the methods on the Arc or whatever ought to be using raw pointers as well instead of &self and friends.

This suggests to me that a good enhancement would be to pursue some way to have raw-pointer methods available. This would take a bit of design work and experimentation, but at least within e.g. rayon it would be an important usability enhancement, I believe.

nikomatsakis commented 4 years ago

(I guess all of this interacts with the question of whether we try to create a new type, like &unsafe, that is the "blessed raw pointer" type.)

nikomatsakis commented 4 years ago

I'm curious what folks think of the immediately actionable step of creating raw pointer-based methods for atomics -- should we at least take this step and then use them to fix this particular instance of this more general issue?

RalfJung commented 4 years ago

That step is irreversible though, at least once we stabilize those methods.

An alternative that also fixes the UB here would be to stop adding "dereferenceable" when UnsafeCell is present. Of course that risks people relying on this, so it might be hard to take back even if we say it's not part of the lang spec (and Arc gets to rely on it solely because it is in libstd and we can adjust it with the lang spec).

nikomatsakis commented 4 years ago

@RalfJung the methods will exist, yes, but of course nobody is forced to use them (they could be deprecated).

I agree that not adding dereferenceable is also a reasonable and theoretically "withdraw-able" step, although I guess that in practice that reliance already exists anyhow (e.g., as I mentioned, I'm sure Rayon contains this same bug).

Mark-Simulacrum commented 4 years ago

I think if we had to add new methods and can't somehow alter the existing ones, either through UnsafeCell detection or an attribute, then we should hold off until we have a final design - there's a lot of methods on Atomics and duplicating them forever seems like a big deal, especially because it's plausible we'll end up with 3 versions in the long run.

RalfJung commented 4 years ago

Turns out crossbeam has a similar bug to Arc (https://github.com/crossbeam-rs/crossbeam/issues/545), and it can even be triggered deterministically. Adding an exception for UnsafeCell would have helped here as well; I am not sure how hard it would be for them to port everything to raw pointers (or a hypothetical &unsafe).

Diggsey commented 3 years ago

Is this special-casing of function arguments something we're happy with / committed to? It feels like a footgun to me, and I think that's demonstrated by how much code already exists which has this issue.

Even if an attribute is added to relax this requirement, it will be such an obscure attribute that it will remain a footgun.

Are we even sure that there would be a measurable perf regression if we had to use the more relaxed form of dereferenceable? Furthermore, given that we'll be emitting dereferenceable on all reference parameters, LLVM should still have most of the information it needs, even with the more relaxed definition:

fn foo(arg: &T) {
    bar(arg);
    // We are worried that LLVM will no longer understand that `arg` is dereferenceable here?

    // ...

    // However, `baz` will also have its argument marked as dereferenceable on entry to this function, so
    // can't LLVM propagate that information backwards to figure out that `arg` was always dereferenceable?
    baz(arg);
}

RalfJung commented 3 years ago

I don't have any data on perf regressions, but I do have data on affected optimizations: without this special treatment for function arguments, there are quite a few powerful optimizations that we cannot do. When considering moving a memory access through a reference across an unknown function call, there are 6 possible cases: three kinds of access (reads and writes through mutable references, and reads through shared references), each of which can be moved either up or down across the call. 4 out of these 6 cases need "protectors" to be valid optimizations in Stacked Borrows, which means they need this special treatment: all 3 cases of moving an access down need it, and moving a write up needs it as well.

So for example, this special treatment allows us to turn

fn foo(x: &mut i32) {
  *x = 13;
  bar(); // some function we know nothing about, except that it is nounwind
  *x = 27;
}

into

fn foo(x: &mut i32) {
  bar();
  *x = 13; // and now this first write can easily be removed
  *x = 27;
}

This optimization is simply not correct if we use something like "dereferenceable only on entry". The counterexample is a version of bar that colludes with the environment to get a pointer that aliases x, uses it, and then goes into an infinite loop. Essentially what happens here is that x's lifetime ends when that alias gets used, and since x is never used again due to the infinite loop, it actually is okay for its lifetime to end.

This is an extremely powerful optimization, unlike anything I have seen in the literature: it calls an unknown function, bar, with globally known memory (x) being in a different state than in the source program! But we can show that bar is unable to observe the contents stored in x. (We actually have established this in Coq.)

(We could do this optimization even when bar unwinds, but then we have to move *x = 13 down into both blocks that follow the call: the regular return block and the unwind block. This can still be worth it if it allows simplifications like the above in the return block and we don't care much about how expensive the unwind block is. But the resulting program, while easy to represent in MIR, is hard to write in surface Rust syntax.)

Diggsey commented 3 years ago

@RalfJung that makes sense: if I'm understanding correctly, this matters for function arguments and not local variables because for locals the compiler can determine whether they escape the function (and if not, the collusion you mentioned cannot happen), whereas for arguments that cannot be determined, and so dereferenceable must be part of the contract of the function?

Couldn't you have special treatment of a reference between its "last use" and when it is removed from the call-stack? After its last use it becomes a "zombie reference" which, by still existing in the stacked borrows "stack", disallows the collusion of bar with the caller, but does not actually require that the reference still be dereferenceable.

edit: I opened an issue on the UCG repo to avoid derailing this thread further.

cuviper commented 2 years ago

I managed to create an Arc::drop test that miri will complain about:

use crossbeam_channel::bounded;
use std::sync::Arc;

fn main() {
    let (sender, receiver) = bounded::<Arc<i32>>(1);

    std::thread::spawn(move || while let Ok(_) = receiver.recv() {});

    for i in 0.. {
        dbg!(i);
        let arc = Arc::new(42);
        sender.send(Arc::clone(&arc)).unwrap();
    }
}

I had to use crossbeam_channel because miri kept telling me that mpsc was deadlocked in futex_wait. But the error here is not under crossbeam, so I hope that detail doesn't really matter.

$ MIRIFLAGS="-Zmiri-disable-isolation" cargo +nightly miri run
[...]
[src/main.rs:10] i = 480
error: Undefined Behavior: deallocating while item is protected: [SharedReadWrite for <13233224> (call 4080208)]
   --> /home/jistone/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/alloc.rs:107:14
    |
107 |     unsafe { __rust_dealloc(ptr, layout.size(), layout.align()) }
    |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ deallocating while item is protected: [SharedReadWrite for <13233224> (call 4080208)]
    |
    = help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
    = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information

    = note: inside `std::alloc::dealloc` at /home/jistone/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/alloc.rs:107:14
    = note: inside `<std::alloc::Global as std::alloc::Allocator>::deallocate` at /home/jistone/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/alloc.rs:244:22
    = note: inside `<std::sync::Weak<i32> as std::ops::Drop>::drop` at /home/jistone/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/sync.rs:2175:22
    = note: inside `std::ptr::drop_in_place::<std::sync::Weak<i32>> - shim(Some(std::sync::Weak<i32>))` at /home/jistone/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:487:1
    = note: inside `std::mem::drop::<std::sync::Weak<i32>>` at /home/jistone/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/mem/mod.rs:974:24
    = note: inside `std::sync::Arc::<i32>::drop_slow` at /home/jistone/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/sync.rs:1109:9
    = note: inside `<std::sync::Arc<i32> as std::ops::Drop>::drop` at /home/jistone/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/sync.rs:1702:13
    = note: inside `std::ptr::drop_in_place::<std::sync::Arc<i32>> - shim(Some(std::sync::Arc<i32>))` at /home/jistone/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:487:1
note: inside `main` at src/main.rs:13:5
   --> src/main.rs:13:5
    |
13  |     }
    |     ^

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

error: aborting due to previous error

I'm not sure why disable-isolation matters here -- I had just copied that from invocations I used for rayon. I haven't reached the race without that (or haven't waited long enough), but with the flag I always get the error on i = 480. :exploding_head:

$ rustc +nightly -Vv
rustc 1.63.0-nightly (420c970cb 2022-06-09)
binary: rustc
commit-hash: 420c970cb1edccbf60ff2aeb51ca01e2300b09ef
commit-date: 2022-06-09
host: x86_64-unknown-linux-gnu
release: 1.63.0-nightly
LLVM version: 14.0.5

saethlin commented 2 years ago

My instinct is that the 480 is just you getting "lucky" as a result of the impact that turning off isolation has on Miri's initial state, and the right -Zmiri-seed value might produce a much earlier protector error. I tried a few and didn't find anything that crashes faster.

I also ran this a few times without disabling isolation and it eventually deadlocks at iteration 3999. Now that doesn't make a whole lot of sense to me, but Miri recently started emulating weak memory effects, so I'm sure there are plenty of bugs out there that the weak memory emulation will find. (It could also be a Miri bug, but betting against the checker is usually a bad idea.)

RalfJung commented 2 years ago

The deadlocks are strange, but the error @cuviper found is exactly what I would expect for this bug -- deallocation of an active protected shared reference.

If we weakened the requirements on deallocation so that they are equal to those on writing (which IMO makes a lot of sense), then this error would no longer arise. But that would of course mean we cannot tell LLVM that such references are dereferenceable for the entire duration of the function call.

LegionMammal978 commented 2 years ago

I ran into this bug when trying to run the tests on the desync crate under Miri. The crate uses several threads with mpsc channels for communication, and some of the tests are repeated 1000 times to account for variation. This bug reliably occurs within a few hundred iterations, making it difficult to find actual data races in the test cases.

Here's an MRE I came up with while debugging this:


/*
Run with:
  MIRIFLAGS='-Zmiri-seed=597' cargo +nightly-2022-06-10 miri run
or:
  MIRIFLAGS='-Zmiri-seed=106 -Zmiri-disable-weak-memory-emulation' cargo +nightly-2022-06-10 miri run
*/

use std::{sync::Arc, thread};

fn main() {
    let arc_1 = Arc::new(());
    let arc_2 = arc_1.clone();
    let thread = thread::spawn(|| drop(arc_2));
    let mut i = 0;
    while i < 256 {
        i += 1;
    }
    drop(arc_1);
    thread.join().unwrap();
}

Coder-256 commented 2 years ago

Perhaps fetch_sub should be changed to the underlying core::intrinsics::atomic_xsub_rel until there is a consensus on how to permanently fix this? I would be happy to open a PR. It is a bit scary that such widely-used code could possibly be unsound; even if it seems to work today, I worry that someday an upstream change to LLVM could make this trigger real UB.

RalfJung commented 2 years ago

Perhaps fetch_sub should be changed to the underlying core::intrinsics::atomic_xsub_rel until there is a consensus on how to permanently fix this?

That won't help since fetch_sub itself still takes a shared reference.

I think we should

RalfJung commented 2 years ago

Ah actually this would also lose the dereferenceable attribute on mutable references -- they can be written to, and hence they can be deallocated during the function call. So we should talk to some of our LLVM folks to figure out if removing dereferenceable for all mutable pointers is an option.

We could of course also say we do the change in Miri but leave LLVM codegen as-is, and argue that this is no worse than clang which also incorrectly emits dereferenceable when it means "dereferenceable on entry"...

RalfJung commented 2 years ago

If you are affected by this in Miri, my current recommendation would be to use -Zmiri-preemption-rate=0. That will get you back the previous behavior where Miri did not detect this problem.

RalfJung commented 2 years ago

I am doing a perf experiment at https://github.com/rust-lang/rust/pull/98017, and the LLVM implications are being discussed on Zulip.

Coder-256 commented 2 years ago

Sorry if I was unclear, I meant replacing fetch_sub with core::intrinsics::atomic_xsub_rel in the body of Arc::drop :)

SoniEx2 commented 2 years ago

We'd like to attempt a fix for this by adding an AtomicPtr to the ArcInner. It'd make Drop significantly slower in the heavy contention case, but microbenchmarks aren't real. The main benefit is that it should work.

RalfJung commented 2 years ago

The LLVM folks were concerned that losing the dereferenceable on &mut might be too high a cost. So here is a potential alternative that avoids this side-effect: we say that &UnsafeCell no longer gets protectors. That means we allow any way of invalidating an &UnsafeCell argument while a function runs, either by deallocating it or by using other pointers further down the stack.

In terms of what we tell LLVM, I think this does exactly what we want: we have to remove dereferenceable from &UnsafeCell, but that's it. There are some other effects, some other code that used to have UB but is now allowed, but it seems extremely unlikely that that is a problem, since we already basically accept that one cannot do any special optimizations around &UnsafeCell (i.e., they optimize basically as badly as regular C pointers).
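As an illustrative example of the kind of code this exception is meant to make well-defined (modeled on the AtomicBool::store scenario linked earlier in this thread; this is a sketch, not code from std):

use std::sync::atomic::{AtomicBool, Ordering};

fn signal_done(flag: &AtomicBool) {
    flag.store(true, Ordering::Release);
    // Another thread may observe this store and free the allocation containing
    // `flag` before `signal_done` returns. Under the proposal, that would be
    // allowed, because `&AtomicBool` (an UnsafeCell-based type) would carry
    // neither a protector nor a whole-call `dereferenceable` claim.
}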

RalfJung commented 2 years ago

I have adjusted https://github.com/rust-lang/rust/pull/98017 to the proposed work-around above, and am proposing to land that PR.

SoniEx2 commented 2 years ago

To expand on our suggestion above: We don't know if it would work, but is it possible to do something akin to Once, where you coordinate Drop using stack-allocated stuff and AtomicPtr? That way, you can guarantee no thread is inside AtomicUsize or AtomicPtr, but instead all threads are talking to each other across their call stacks. (Stack allocations are "free", at least compared to heap allocations.) As such there's no need to modify any of these things or add any workarounds.