rust-lang / miri

An interpreter for Rust's mid-level intermediate representation
Apache License 2.0
4.32k stars 329 forks source link

Detecting races when releasing Tree Borrows protectors? #3266

Open Imberflur opened 8 months ago

Imberflur commented 8 months ago

IIUC in tree borrows protectors generate implicit reads when they are released. However, the data race detector doesn't seem to be aware of them. Does it make sense for these to be accesses for the data race model?

This looks like a similar issue https://github.com/rust-lang/miri/issues/2648

Here is an example I tested with MIRIFLAGS="-Zmiri-tree-borrows" cargo miri run.

use core::sync::atomic::AtomicBool;
use core::sync::atomic::Ordering;
use std::alloc;

static FLAG: AtomicBool = AtomicBool::new(false);

fn protector_data_race(v: &u8) {
    println!("{v}");
    FLAG.store(true, Ordering::Release);
    // Implicit read from protector not detected as a data race(?), but uncommenting this will be
    // detected:
    //println!("{v}");
}

fn main() {
    let layout = alloc::Layout::new::<u8>();
    let ptr = unsafe { alloc::alloc(layout) };
    assert!(!ptr.is_null());
    unsafe { *ptr = 8 };

    struct SendPtr(*mut u8);
    unsafe impl Send for SendPtr {}
    let send_ptr = SendPtr(ptr);

    std::thread::spawn(|| {
        let send_ptr = send_ptr;
        let r = unsafe { &*send_ptr.0 };
        protector_data_race(r);
    });

    loop {
        if FLAG.load(Ordering::Acquire) {
            // There is an actual race since we need to delay here to prevent the dealloc from
            // running while the protector still exists (generating a different kind of error).
            //
            // But we want to preserve the race to test the race detector... so this just loops,
            // perhaps there is a better and less fragile way to do this.
            for _ in 0..100 {}
            unsafe { alloc::dealloc(ptr.cast(), layout) };
            break;
        }
    }
}

The relevant code seems to be: https://github.com/rust-lang/miri/blob/13c915bb42aab8ee4075632e2aff42e95d570b56/src/borrow_tracker/tree_borrows/mod.rs#L109 https://github.com/rust-lang/miri/blob/13c915bb42aab8ee4075632e2aff42e95d570b56/src/borrow_tracker/tree_borrows/tree.rs#L660

RalfJung commented 8 months ago

Cc @Vanille-N

Vanille-N commented 8 months ago

Oh no is this a case of "the implementation didn't support this out of the box so I didn't implement it at all" ? I recall thinking specifically about the fact that this was not a pleasant thing to have to do wrt the data race model.

Of course the first fix that comes to mind is to insert around here a data_race.read(..., perms_range, ...) but this exposes an internal range of the RangeMap to the user, which we decided against. Also it requires passing around more function arguments because there's currently no place in the code that has access to both data_race and rperms.

A possible alternative could be to give data_race a similar feature as tree with a way to do an access on an unspecified range and a per-location way of determining if it should be done on demand ?

Either way if we don't want to go back on the promise that RangeMap is an implementation detail, this will need some extra features in data_race.

RalfJung commented 7 months ago

Good point! I am not sure if it is strictly necessary to make this a race, but just for consistency with the races on function entry it would be good to do this.

this exposes an internal range of the RangeMap to the user, which we decided against.

Exposing those ranges is unfortunate but not a complete blocker. We can just have a FIXME saying that in the future we might want to somehow normalize the range we send to data_race to make it not depend on RangeMap splitting/merging.

JoJoDeveloping commented 1 month ago

Data race interaction with TB is an interesting question.

First, notice that your code already has undefined behavior since your deallocation (even just through the fact that it acts as a write, ignoring strong protector deallocation semantics) can trigger the protector protecting the argument v of protector_data_race. So from a "program semantics" point of view, this needs not be added. It only needs to be added to get reliable UB detection, since that can currently be defeated with high likelihood by busy waiting (as demonstrated in your example).

If we ignore this "busy wait bug," we can argue that data race detection is not needed for the protector end semantics: Any foreign write that could be racy is already accounted for by the fact that the existing protector ensures that there is no foreign write, and if there is a data race there is an interleaving where we can make the write happen before the protector ends. Similarly, if the protector end access is a write (see #3732), then any foreign read is already disabled.

What adding the data race here would do is also catch data races from children, which are usually unaffected by the protector end semantics. However, I personally think that data race protection should be integrated more finely with the aliasing model. In case of Tree Borrows, I would argue that it should be a data race to concurrently access data using pointers that do not have the same tag. This would allow us to add this to the protector end accesses data race detector without having to worry about children. In general, I think that this would add very little UB (for reasons similar to the above: you can usually construct an interleaving where one write happening first makes the other write impossible).

The only case where this would add UB is those involving &mut references to interior mutable types (ReservedIM), but these are weird enough already that adding slightly more UB there should not hurt. I can not think of a way this could cause UB in safe code, since &mut atomics are only ever accessed from one thread. In all the other cases, adding this rule to the data race detector only helps to make UB detection more reliable.