Epoll deadlock on tokio's test

tiif commented 1 month ago

Description

When running one of the tokio test with cargo +nightly miri test --features full --test spawn_pinned, miri reported deadlock on epoll_wait syscall. This is likely caused by epoll shim not receiving an epoll event notification to unblock the thread when it should have received one in real world system.

Version

Rustc:

rustc 1.83.0-nightly (7042c269c 2024-09-23)
binary: rustc
commit-hash: 7042c269c166191cd5d8daf0409890903df7af57
commit-date: 2024-09-23
host: x86_64-unknown-linux-gnu
release: 1.83.0-nightly
LLVM version: 19.1.0

Tokio: Tokio repo master branch commit 27539ae3

Full trace

```rust Finished `test` profile [unoptimized + debuginfo] target(s) in 0.06s Running tests/spawn_pinned.rs (target/miri/x86_64-unknown-linux-gnu/debug/deps/spawn_pinned-5d9d1b31504e33c5) running 9 tests test callback_panic_does_not_kill_worker ... error: deadlock: the evaluated program deadlocked --> /home/byt/.cargo/registry/src/index.crates.io-6f17d22bba15001f/mio-1.0.2/src/sys/unix/selector/epoll.rs:56:9 | 56 | / syscall!(epoll_wait( 57 | | self.ep.as_raw_fd(), 58 | | events.as_mut_ptr(), 59 | | events.capacity() as i32, 60 | | timeout,d 61 | | )) | |__________^ the evaluated program deadlocked | = note: BACKTRACE on thread `unnamed-2`: = note: inside `mio::sys::unix::selector::Selector::select` at /home/byt/.cargo/registry/src/index.crates.io-6f17d22bba15001f/mio-1.0.2/src/sys/unix/mod.rs:8:48: 8:49 = note: inside `mio::poll::Poll::poll` at /home/byt/.cargo/registry/src/index.crates.io-6f17d22bba15001f/mio-1.0.2/src/poll.rs:435:9: 435:61 note: inside `tokio::runtime::io::driver::Driver::turn` --> /home/byt/Documents/tokio/tokio/src/runtime/io/driver.rs:149:15 | 149 | match self.poll.poll(events, max_wait) { | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::io::driver::Driver::park` --> /home/byt/Documents/tokio/tokio/src/runtime/io/driver.rs:122:9 | 122 | self.turn(handle, None); | ^^^^^^^^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::signal::Driver::park` --> /home/byt/Documents/tokio/tokio/src/runtime/signal/mod.rs:92:9 | 92 | self.io.park(handle); | ^^^^^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::process::Driver::park` --> /home/byt/Documents/tokio/tokio/src/runtime/process.rs:32:9 | 32 | self.park.park(handle); | ^^^^^^^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::driver::IoStack::park` --> /home/byt/Documents/tokio/tokio/src/runtime/driver.rs:175:40 | 175 | IoStack::Enabled(v) => v.park(handle), | ^^^^^^^^^^^^^^ note: inside `tokio::runtime::time::Driver::park_internal` --> /home/byt/Documents/tokio/tokio/src/runtime/time/mod.rs:247:21 | 247 | self.park.park(rt_handle); | ^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::time::Driver::park` --> /home/byt/Documents/tokio/tokio/src/runtime/time/mod.rs:173:9 | 173 | self.park_internal(handle, None); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::driver::TimeDriver::park` --> /home/byt/Documents/tokio/tokio/src/runtime/driver.rs:332:55 | 332 | TimeDriver::Enabled { driver, .. } => driver.park(handle), | ^^^^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::driver::Driver::park` --> /home/byt/Documents/tokio/tokio/src/runtime/driver.rs:71:9 | 71 | self.inner.park(handle); | ^^^^^^^^^^^^^^^^^^^^^^^ note: inside closure --> /home/byt/Documents/tokio/tokio/src/runtime/scheduler/current_thread/mod.rs:382:17 | 382 | driver.park(&handle.driver); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::scheduler::current_thread::Context::enter::<(), {closure@tokio::runtime::scheduler::current_thread::Context::park::{closure#1}}>` --> /home/byt/Documents/tokio/tokio/src/runtime/scheduler/current_thread/mod.rs:423:19 | 423 | let ret = f(); | ^^^ note: inside `tokio::runtime::scheduler::current_thread::Context::park` --> /home/byt/Documents/tokio/tokio/src/runtime/scheduler/current_thread/mod.rs:381:27 | 381 | let (c, ()) = self.enter(core, || { | ___________________________^ 382 | | driver.park(&handle.driver); 383 | | self.defer.wake(); 384 | | }); | |______________^ note: inside closure --> /home/byt/Documents/tokio/tokio/src/runtime/scheduler/current_thread/mod.rs:724:33 | 724 | ... context.park(core, handle) | ^^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside closure --> /home/byt/Documents/tokio/tokio/src/runtime/scheduler/current_thread/mod.rs:774:68 | 774 | let (core, ret) = context::set_scheduler(&self.context, || f(core, context)); | ^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::context::scoped::Scoped::::set::<{closure@tokio::runtime::scheduler::current_thread::CoreGuard<'_>::enter<{closure@tokio::runtime::scheduler::current_thread::CoreGuard<'_>::block_on()}>>::{closure#0}}, std::option::Option<()>>::{closure#0}}, (std::boxed::Box, std::option::Option<()>)>` --> /home/byt/Documents/tokio/tokio/src/runtime/context/scoped.rs:40:9 | 40 | f() | ^^^ note: inside closure --> /home/byt/Documents/tokio/tokio/src/runtime/context.rs:180:26 | 180 | CONTEXT.with(|c| c.scheduler.set(v, f)) | ^^^^^^^^^^^^^^^^^^^^^ = note: inside `std::thread::LocalKey::::try_with::<{closure@tokio::runtime::context::set_scheduler<(std::boxed::Box, std::option::Option<()>), {closure@tokio::runtime::scheduler::current_thread::CoreGuard<'_>::enter<{closure@tokio::runtime::scheduler::current_thread::CoreGuard<'_>::block_on()}>>::{closure#0}}, std::option::Option<()>>::{closure#0}}>::{closure#0}}, (std::boxed::Box, std::option::Option<()>)>` at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:283:12: 283:27 = note: inside `std::thread::LocalKey::::with::<{closure@tokio::runtime::context::set_scheduler<(std::boxed::Box, std::option::Option<()>), {closure@tokio::runtime::scheduler::current_thread::CoreGuard<'_>::enter<{closure@tokio::runtime::scheduler::current_thread::CoreGuard<'_>::block_on()}>>::{closure#0}}, std::option::Option<()>>::{closure#0}}>::{closure#0}}, (std::boxed::Box, std::option::Option<()>)>` at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:260:9: 260:25 note: inside `tokio::runtime::context::set_scheduler::<(std::boxed::Box, std::option::Option<()>), {closure@tokio::runtime::scheduler::current_thread::CoreGuard<'_>::enter<{closure@tokio::runtime::scheduler::current_thread::CoreGuard<'_>::block_on()}>>::{closure#0}}, std::option::Option<()>>::{closure#0}}>` --> /home/byt/Documents/tokio/tokio/src/runtime/context.rs:180:9 | 180 | CONTEXT.with(|c| c.scheduler.set(v, f)) | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::scheduler::current_thread::CoreGuard::<'_>::enter::<{closure@tokio::runtime::scheduler::current_thread::CoreGuard<'_>::block_on()}>>::{closure#0}}, std::option::Option<()>>` --> /home/byt/Documents/tokio/tokio/src/runtime/scheduler/current_thread/mod.rs:774:27 | 774 | let (core, ret) = context::set_scheduler(&self.context, || f(core, context)); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside closure --> /home/byt/Documents/tokio/tokio/src/runtime/scheduler/current_thread/mod.rs:191:28 | 191 | return core.block_on(future); | ^^^^^^^^^^^^^^^^^^^^^ note: inside `tokio_util::task::spawn_pinned::LocalWorkerHandle::run` --> /home/byt/Documents/tokio/tokio-util/src/task/spawn_pinned.rs:396:9 | 396 | / local_set.block_on(&runtime, async { 397 | | while let Some(spawn_task) = task_receiver.recv().await { 398 | | // Calls spawn_local(future) 399 | | (spawn_task)(); 400 | | } 401 | | }); | |__________^ note: inside closure --> /home/byt/Documents/tokio/tokio-util/src/task/spawn_pinned.rs:381:31 | 381 | std::thread::spawn(|| Self::run(runtime, receiver, task_count_clone)); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ = note: this error originates in the macro `syscall` (in Nightly builds, run with -Z macro-backtrace for more info) error: deadlock: the evaluated program deadlocked --> /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys/pal/unix/futex.rs:75:21 | 75 | ) | ^ the evaluated program deadlocked | = note: BACKTRACE: = note: inside `std::sys::pal::unix::futex::futex_wait` at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys/pal/unix/futex.rs:75:21: 75:22 = note: inside `std::sys::sync::thread_parking::futex::Parker::park` at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys/sync/thread_parking/futex.rs:55:13: 55:50 = note: inside `std::thread::Thread::park` at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/mod.rs:1404:18: 1404:53 = note: inside `std::thread::park` at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/mod.rs:1110:9: 1110:25 = note: inside `std::sync::mpmc::context::Context::wait_until` at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sync/mpmc/context.rs:138:17: 138:31 = note: inside closure at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sync/mpmc/list.rs:447:27: 447:50 = note: inside closure at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sync/mpmc/context.rs:49:13: 49:18 = note: inside closure at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sync/mpmc/context.rs:57:31: 57:37 = note: inside `std::thread::LocalKey::>>::try_with::<{closure@std::sync::mpmc::context::Context::with<{closure@std::sync::mpmc::list::Channel::recv::{closure#1}}, ()>::{closure#1}}, ()>` at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:283:12: 283:27 = note: inside `std::sync::mpmc::context::Context::with::<{closure@std::sync::mpmc::list::Channel::recv::{closure#1}}, ()>` at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sync/mpmc/context.rs:52:9: 61:15 = note: inside `std::sync::mpmc::list::Channel::::recv` at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sync/mpmc/list.rs:437:13: 458:15 = note: inside `std::sync::mpmc::Receiver::::recv` at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sync/mpmc/mod.rs:308:43: 308:58 = note: inside `std::sync::mpsc::Receiver::::recv` at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sync/mpsc/mod.rs:848:9: 848:26 = note: inside `test::run_tests::<{closure@test::run_tests_console::{closure#2}}>` at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/test/src/lib.rs:379:38: 379:47 = note: inside `test::run_tests_console` at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/test/src/console.rs:322:5: 322:70 = note: inside `test::test_main` at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/test/src/lib.rs:137:19: 137:59 = note: inside `test::test_main_static` at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/test/src/lib.rs:159:5: 159:40 = note: inside `main` error: deadlock: the evaluated program deadlocked --> /home/byt/.cargo/registry/src/index.crates.io-6f17d22bba15001f/mio-1.0.2/src/sys/unix/selector/epoll.rs:56:9 | 56 | / syscall!(epoll_wait( 57 | | self.ep.as_raw_fd(), 58 | | events.as_mut_ptr(), 59 | | events.capacity() as i32, 60 | | timeout, 61 | | )) | |__________^ the evaluated program deadlocked | = note: BACKTRACE on thread `callback_panic_`: = note: inside `mio::sys::unix::selector::Selector::select` at /home/byt/.cargo/registry/src/index.crates.io-6f17d22bba15001f/mio-1.0.2/src/sys/unix/mod.rs:8:48: 8:49 = note: inside `mio::poll::Poll::poll` at /home/byt/.cargo/registry/src/index.crates.io-6f17d22bba15001f/mio-1.0.2/src/poll.rs:435:9: 435:61 note: inside `tokio::runtime::io::driver::Driver::turn` --> /home/byt/Documents/tokio/tokio/src/runtime/io/driver.rs:149:15 | 149 | match self.poll.poll(events, max_wait) { | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::io::driver::Driver::park` --> /home/byt/Documents/tokio/tokio/src/runtime/io/driver.rs:122:9 | 122 | self.turn(handle, None); | ^^^^^^^^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::signal::Driver::park` --> /home/byt/Documents/tokio/tokio/src/runtime/signal/mod.rs:92:9 | 92 | self.io.park(handle); | ^^^^^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::process::Driver::park` --> /home/byt/Documents/tokio/tokio/src/runtime/process.rs:32:9 | 32 | self.park.park(handle); | ^^^^^^^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::driver::IoStack::park` --> /home/byt/Documents/tokio/tokio/src/runtime/driver.rs:175:40 | 175 | IoStack::Enabled(v) => v.park(handle), | ^^^^^^^^^^^^^^ note: inside `tokio::runtime::time::Driver::park_internal` --> /home/byt/Documents/tokio/tokio/src/runtime/time/mod.rs:247:21 | 247 | self.park.park(rt_handle); | ^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::time::Driver::park` --> /home/byt/Documents/tokio/tokio/src/runtime/time/mod.rs:173:9 | 173 | self.park_internal(handle, None); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::driver::TimeDriver::park` --> /home/byt/Documents/tokio/tokio/src/runtime/driver.rs:332:55 | 332 | TimeDriver::Enabled { driver, .. } => driver.park(handle), | ^^^^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::driver::Driver::park` --> /home/byt/Documents/tokio/tokio/src/runtime/driver.rs:71:9 | 71 | self.inner.park(handle); | ^^^^^^^^^^^^^^^^^^^^^^^ note: inside closure --> /home/byt/Documents/tokio/tokio/src/runtime/scheduler/current_thread/mod.rs:382:17 | 382 | driver.park(&handle.driver); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::scheduler::current_thread::Context::enter::<(), {closure@tokio::runtime::scheduler::current_thread::Context::park::{closure#1}}>` --> /home/byt/Documents/tokio/tokio/src/runtime/scheduler/current_thread/mod.rs:423:19 | 423 | let ret = f(); | ^^^ note: inside `tokio::runtime::scheduler::current_thread::Context::park` --> /home/byt/Documents/tokio/tokio/src/runtime/scheduler/current_thread/mod.rs:381:27 | 381 | let (c, ()) = self.enter(core, || { | ___________________________^ 382 | | driver.park(&handle.driver); 383 | | self.defer.wake(); 384 | | }); | |______________^ note: inside closure --> /home/byt/Documents/tokio/tokio/src/runtime/scheduler/current_thread/mod.rs:724:33 | 724 | ... context.park(core, handle) | ^^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside closure --> /home/byt/Documents/tokio/tokio/src/runtime/scheduler/current_thread/mod.rs:774:68 | 774 | let (core, ret) = context::set_scheduler(&self.context, || f(core, context)); | ^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::context::scoped::Scoped::::set::<{closure@tokio::runtime::scheduler::current_thread::CoreGuard<'_>::enter<{closure@tokio::runtime::scheduler::current_thread::CoreGuard<'_>::block_on>>>::{closure#0}}, std::option::Option<()>>::{closure#0}}, (std::boxed::Box, std::option::Option<()>)>` --> /home/byt/Documents/tokio/tokio/src/runtime/context/scoped.rs:40:9 | 40 | f() | ^^^ note: inside closure --> /home/byt/Documents/tokio/tokio/src/runtime/context.rs:180:26 | 180 | CONTEXT.with(|c| c.scheduler.set(v, f)) | ^^^^^^^^^^^^^^^^^^^^^ = note: inside `std::thread::LocalKey::::try_with::<{closure@tokio::runtime::context::set_scheduler<(std::boxed::Box, std::option::Option<()>), {closure@tokio::runtime::scheduler::current_thread::CoreGuard<'_>::enter<{closure@tokio::runtime::scheduler::current_thread::CoreGuard<'_>::block_on>>>::{closure#0}}, std::option::Option<()>>::{closure#0}}>::{closure#0}}, (std::boxed::Box, std::option::Option<()>)>` at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:283:12: 283:27 = note: inside `std::thread::LocalKey::::with::<{closure@tokio::runtime::context::set_scheduler<(std::boxed::Box, std::option::Option<()>), {closure@tokio::runtime::scheduler::current_thread::CoreGuard<'_>::enter<{closure@tokio::runtime::scheduler::current_thread::CoreGuard<'_>::block_on>>>::{closure#0}}, std::option::Option<()>>::{closure#0}}>::{closure#0}}, (std::boxed::Box, std::option::Option<()>)>` at /home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:260:9: 260:25 note: inside `tokio::runtime::context::set_scheduler::<(std::boxed::Box, std::option::Option<()>), {closure@tokio::runtime::scheduler::current_thread::CoreGuard<'_>::enter<{closure@tokio::runtime::scheduler::current_thread::CoreGuard<'_>::block_on>>>::{closure#0}}, std::option::Option<()>>::{closure#0}}>` --> /home/byt/Documents/tokio/tokio/src/runtime/context.rs:180:9 | 180 | CONTEXT.with(|c| c.scheduler.set(v, f)) | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside `tokio::runtime::scheduler::current_thread::CoreGuard::<'_>::enter::<{closure@tokio::runtime::scheduler::current_thread::CoreGuard<'_>::block_on>>>::{closure#0}}, std::option::Option<()>>` --> /home/byt/Documents/tokio/tokio/src/runtime/scheduler/current_thread/mod.rs:774:27 | 774 | let (core, ret) = context::set_scheduler(&self.context, || f(core, context)); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside closure --> /home/byt/Documents/tokio/tokio/src/runtime/scheduler/current_thread/mod.rs:191:28 | 191 | return core.block_on(future); | ^^^^^^^^^^^^^^^^^^^^^ note: inside `callback_panic_does_not_kill_worker` --> tokio-util/tests/spawn_pinned.rs:120:5 | 120 | assert_eq!(result.unwrap(), "test"); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside closure --> tokio-util/tests/spawn_pinned.rs:100:47 | 98 | #[tokio::test] | -------------- in this procedural macro expansion 99 | #[cfg(panic = "unwind")] 100 | async fn callback_panic_does_not_kill_worker() { | ^ = note: this error originates in the macro `syscall` which comes from the expansion of the attribute macro `tokio::test` (in Nightly builds, run with -Z macro-backtrace for more info) note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace error: aborting due to 3 previous errors error: test failed, to rerun pass `-p tokio-util --test spawn_pinned` Caused by: process didn't exit successfully: `/home/byt/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/bin/cargo-miri runner /home/byt/Documents/tokio/target/miri/x86_64-unknown-linux-gnu/debug/deps/spawn_pinned-5d9d1b31504e33c5` (exit status: 1) note: test exited abnormally; to see the full output pass --nocapture to the harness. ```

FrankReh commented 1 month ago

Adding VClock to the epoll ready_list fixes the tokio reproducer.

RalfJung commented 1 month ago

Awesome. :) Nice job identifying which part of the non-determinism is the culprit here.

FrankReh commented 1 month ago

For the PR, I was asked to include the test case described above. I could use some help in figuring out how to get the test either run or compiled. When I try to run it, it won't compile but maybe there is a different way the test should be run?

I tried

./miri test libc-epoll-blocking

but that fails because the test harness says the function didn't compile and it shows a reasonable compiler error:

--- tests/pass-dep/libc/libc-epoll-blocking.stderr
+++ <stderr output>
+error: creating a shared reference to mutable static is discouraged
+  --> tests/pass-dep/libc/libc-epoll-blocking.rs:LL:CC
+   |
+LL |     unsafe { assert_eq!(VAL, 1) };
+   |                         ^^^ shared reference to mutable static
+   |
+   = note: for more information, see <https://doc.rust-lang.org/nightly/edition-guide/rust-2024/static-mut-references.html>
+   = note: shared references to mutable statics are dangerous; it's undefined behavior if the static is mutated or if a mutable reference is created for it while the shared reference lives
+   = note: `-D static-mut-refs` implied by `-D warnings`
+   = help: to override `-D warnings` add `#[allow(static_mut_refs)]`
+
+error: miri cannot be run on programs that fail compilation
+
+error: aborting due to 2 previous errors
+

FrankReh commented 1 month ago

Follow-up question. I found another way to run the test so now it's just a warning but then an assertion fails.

$ ./miri run --dep tests/pass-dep/libc/libc-epoll-blocking.rs
warning: creating a shared reference to mutable static is discouraged
   --> tests/pass-dep/libc/libc-epoll-blocking.rs:173:25
    |
173 |     unsafe { assert_eq!(VAL, 1) };
    |                         ^^^ shared reference to mutable static
    |
    = note: for more information, see <https://doc.rust-lang.org/nightly/edition-guide/rust-2024/static-mut-references.html>
    = note: shared references to mutable statics are dangerous; it's undefined behavior if the static is mutated or if a mutable reference is created for it while the shared reference lives
    = note: `#[warn(static_mut_refs)]` on by default

thread 'main' panicked at tests/pass-dep/libc/libc-epoll-blocking.rs:171:5:
assertion `left == right` failed: got wrong events
  left: 4
 right: 5

and here you can see what line 171 in that file is:

   142  // This test shows a data_race before epoll had vector clocks added.
   143  fn test_epoll_race() {
   144      // Create an epoll instance.
   145      let epfd = unsafe { libc::epoll_create1(0) };
   146      assert_ne!(epfd, -1);
   147  
   148      // Create an eventfd instance.
   149      let flags = libc::EFD_NONBLOCK | libc::EFD_CLOEXEC;
   150      let fd = unsafe { libc::eventfd(0, flags) };
   151  
   152      // Register eventfd with the epoll instance.
   153      let mut ev = libc::epoll_event { events: EPOLL_IN_OUT_ET, u64: fd as u64 };
   154      let res = unsafe { libc::epoll_ctl(epfd, libc::EPOLL_CTL_ADD, fd, &mut ev) };
   155      assert_eq!(res, 0);
   156  
   157      static mut VAL: u8 = 0;
   158      let thread1 = thread::spawn(move || {
   159          // Write to the static mut variable.
   160          unsafe { VAL = 1 };
   161          // Write to the eventfd instance.
   162          let sized_8_data: [u8; 8] = 1_u64.to_ne_bytes();
   163          let res = unsafe { libc::write(fd, sized_8_data.as_ptr() as *const libc::c_void, 8) };
   164          // read returns number of bytes that have been read, which is always 8.
   165          assert_eq!(res, 8);
   166      });
   167      thread::yield_now();
   168      // epoll_wait for the event to happen.
   169      let expected_event = u32::try_from(libc::EPOLLIN | libc::EPOLLOUT).unwrap();
   170      let expected_value = u64::try_from(fd).unwrap();
   171      check_epoll_wait::<8>(epfd, &[(expected_event, expected_value)], -1);
   172      // Read from the static mut variable.
   173      unsafe { assert_eq!(VAL, 1) };
   174      thread1.join().unwrap();
   175  }

If I run it as ./miri run --dep ./tests/pass-dep/libc/libc-epoll-blocking.rs -Zmiri-preemption-rate=0 then the test passes and there is just the warning about the shared reference to mutable static.

Is this the normal way to run tests and should I just commit this to the PR as is? The PR's CI isn't going to flag the warning, the assertion, or the compile error?

oli-obk commented 1 month ago

you can grep for -Zmiri-preemption-rate to see how other tests add that flag.

the warnings can be disabled with the attribute mentioned in the diagnostic message

FrankReh commented 1 month ago

Under what conditions do the //@compile-flags: kick in? I see the relevant one already in the file under test. But I also see a comment to indicate what test function it is needed for on the line above it and if I remove the lines completely, only the new/last function asserts a failure, not the one the comments indicated the flag was there for.

So why does putting the flag on the command line to miri work but not the comment in the header? I find dozens of test files with similar lines in their header.

Here is what the file already has:

//@only-target: linux
// test_epoll_block_then_unblock depends on a deterministic schedule.
//@compile-flags: -Zmiri-preemption-rate=0

I found the ui_test docs talk about this comment flag, but it doesn't help me understand why passing the -Z flag on the test command line makes it pass but not having it on the command line but in the test file, like dozens of others, and only this new test function fails?

FrankReh commented 1 month ago

Well, as soon as I posted that, I thought of something else to try.

Just ./miri test and the tests pass. But only if that @compile-flags is in the file. without that compile-flags line, the whole test reports the new epoll test function fails, but not the other function, maybe due to a different seed?

But when the test is run with a FILTER, either as a test or a run, the flag has to be provided on the command-line. Was this advice somewhere I should have been reading first - it would probably be in the ui_test docs but I don't see it? Or am I still misinterpreting things and the results are explained some other way? 😕

tiif commented 1 month ago

There are two ways to run the test in miri, one is ./miri test, another one is ./miri run. ./miri test will understand the @compile-flags (probably because this command is using ui_test) . In ./miri run, I need to pass all the flags myself, and it behaves like running miri normally on a program, it won't give me hint to bless the test if it fails, it will just provide the normal output. Usually if ./miri test pass, it will pass on CI.

There is a brief introduction about testing in https://github.com/rust-lang/miri/blob/master/CONTRIBUTING.md#building-and-testing-miri

If you are unsure, you can push the change to the PR first, and we can help to check what's wrong by reading the CI log.

oli-obk commented 1 month ago

Thanks for debugging this, finding a fix and opening a PR! This was definitely very unexpected from just looking at the original issue

RalfJung commented 1 month ago

I am wondering if there are API things we can do to reduce the likelihood of this. Thread unblocking almost always occurs because of some other event an another thread that we should register a causal link with, so maybe the thread unblocking function should take a clock. It's just we've so far kept those subsystems largely orthogonal.

rust-lang / miri

Epoll deadlock on tokio's test #3911