rust-lang / miri

An interpreter for Rust's mid-level intermediate representation
Apache License 2.0
4.53k stars 343 forks source link

Avoid eagerly executing unblock callback in ``unblock_thread`` #4011

Closed tiif closed 1 hour ago

tiif commented 1 day ago

In #3939, with Zmiri-preemption-rate=0 this test will deadlock on eventfd::read in main thread

fn test_read_block_fail() {
    let flags = libc::EFD_CLOEXEC;
    let fd = unsafe { libc::eventfd(0, flags) };
    let thread1 = thread::spawn(move || {
        let mut buf: [u8; 8] = [0; 8];
        let res = read_bytes(fd, &mut buf);
        assert_eq!(res, 8); 
    });
    let sized_8_data: [u8; 8] = 1_u64.to_ne_bytes();
    // Pass control to thread1 so it can block on eventfd `read`.
    thread::yield_now();
    let res = write_bytes(fd, sized_8_data);
    assert_eq!(res, 8);
    let mut buf: [u8; 8] = [0; 8];
    let res = read_bytes(fd, &mut buf); // <-- program deadlock here!
   assert_eq!(res, 8); 
    thread1.join().unwrap();
}

This mans that even with -Zmiri-preemption-rate=0, main thread is still preempted after let res = write_bytes(fd, sized_8_data); to give a return value for read_bytes (a wrapper for eventfd::read) in thread1.

EDIT: The non-block example is wrong, but the point that I want to convey is the blocking variant should continue with the read_bytes (a wrapper for eventfd::read) in main thread instead of thread1.

It is expected that the place where the deadlock occurs should be the same place where the assertion fails. But this is currently not happening because when main thread writes to eventfd after thread1 blocked on it, thread1 will be unblocked eagerly and gets a return value for eventfd::read

Additional context: https://github.com/rust-lang/miri/pull/3939#issuecomment-2439885944

Fix: Instead of eagerly executing unblock callback in unblock_thread https://github.com/rust-lang/miri/blob/11549983d96ea292d755daf3053eac3b5bf79511/src/concurrency/thread.rs#L1079 we could add a ThreadState::Unblock and only calling the unblock callback through ThreadManager::schedule.

RalfJung commented 1 day ago

we could add a ThreadState::Unblock and only calling the unblock callback through ThreadManager::schedule.

No that doesn't work. As I already said last week, the entire system is built around running the callback immediately and other users of the callback (in particular the synchronization primitives) rely on that.

oli-obk commented 1 day ago

I think the issue is us doing actually interesting logic in the callback. Just doing a few raw sync ops is different from actually performing a read or a write.

So we could either add a new kind of unblocked-thread-continue-callback or just have an empty unblock callback and instead set the mir block and statement position to the one that invoked the shim that caused the thread to block, effectively redoing the op from scratch. This is not generally correct, but for read and write it should be? Any divergence between redoing the op and finishing it would at least be very odd.

RalfJung commented 17 hours ago

but the point that I want to convey is the blocking variant should continue with the read_bytes (a wrapper for eventfd::read) in main thread instead of thread1.

I don't think I quite agree with that framing.

The equivalent example with a lock will behave similarly: if the main thread releases the lock, and another thread is waiting, it will immediately acquire the lock, even with a preemption rate of 0. That is the expected behavior of a lock -- to do an ownership transfer from the old lock holder to the first thread in the queue. Based on this we have the invariant that an unlocked lock always has an empty queue, which in turn has ramifications for things like "what if the user moves the lock to a new thread".

So, I wouldn't say that in your example, the main thread is being preempted. Rather, when the main thread does a write, that operation immediately checks the wait queue for actions that can now be performed, and performs them. If we don't do this, we'll be in a very odd state where there is a thread in the wait queue for this FD despite the FD being ready for reads! This isn't preempting the current thread in favor of another thread, it is the "kernel" (Miri) performing actions that have been queued up to be performed "immediately once a read is possible on this FD".

oli-obk commented 1 hour ago

Oh right. We're emulating a syscall, not an arbitrary c function. From the thread's perspective it is atomic

I think there's nothing to change here