tokio-rs / loom

Concurrency permutation testing tool for Rust.
MIT License
2.13k stars 111 forks source link

panicked while panicking in loom::sync::arc::Arc::drop after deadlock in code under test #291

Open sporksmith opened 1 year ago

sporksmith commented 1 year ago

Hi, I'm trying to get started testing a custom mutex with loom. I have a draft PR here, where you can see the loom tests, and some functions I had to stub out (e.g. futex). https://github.com/shadow/shadow/pull/2553/files#diff-c8d75692cd8a476ff7f5af73e87d2d3567440ad8190c8bbad0e6493fa223b1da

First I run our build wrapper to pre-generate a header file:

./setup build

Then I'm trying to run loom with:

$ LOOM_CHECKPOINT_FILE=my_test.json RUST_BACKTRACE=full RUSTFLAGS="--cfg loom" cargo test --manifest-path=src/Cargo.toml -p shadow-shim-helper-rs --test loom_tests --release

I get:

...
running 2 tests
test test_basic ... ok
thread panicked while panicking. aborting.
error: test failed, to rerun pass `-p shadow-shim-helper-rs --test loom_tests`

Caused by:
  process didn't exit successfully: `/home/jnewsome/projects/shadow/dev/src/target/release/deps/loom_tests-73d5a34e9486c4e8` (signal: 6, SIGABRT: process abort signal)

If I run the test directly under gdb I'm able to get a backtrace:

$ gdb /home/jnewsome/projects/shadow/dev/src/target/release/deps/loom_tests-73d5a34e9486c4e8
(gdb) run
Thread 3 "test_threads" received signal SIGABRT, Aborted.
[Switching to LWP 2178864]
0x00007ffff7d01a7c in pthread_kill () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007ffff7d01a7c in pthread_kill () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff7cad476 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007ffff7c937f3 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00005555556ac627 in std::sys::unix::abort_internal () at library/std/src/sys/unix/mod.rs:317
#4  0x00005555556a6c21 in std::panicking::rust_panic_with_hook () at library/std/src/panicking.rs:713
#5  0x00005555556a68c1 in std::panicking::begin_panic_handler::{closure#0} () at library/std/src/panicking.rs:586
#6  0x00005555556a4adc in std::sys_common::backtrace::__rust_end_short_backtrace<std::panicking::begin_panic_handler::{closure_env#0}, !> () at library/std/src/sys_common/backtrace.rs:138
#7  0x00005555556a6622 in std::panicking::begin_panic_handler () at library/std/src/panicking.rs:584
#8  0x00005555555a12b3 in core::panicking::panic_fmt () at library/core/src/panicking.rs:142
#9  0x00005555555a117d in core::panicking::panic () at library/core/src/panicking.rs:48
#10 0x0000555555604fc0 in loom::rt::object::Ref<T>::set_action ()
#11 0x000055555560a7c6 in scoped_tls::ScopedKey<T>::with ()
#12 0x00005555555fd6f7 in loom::rt::branch ()
#13 0x00005555555fcd4d in loom::rt::arc::Arc::ref_dec ()
#14 0x00005555555b33ac in <loom::sync::arc::Arc<T> as core::ops::drop::Drop>::drop ()
#15 0x00005555555aa473 in core::ptr::drop_in_place<loom::sync::arc::Arc<shadow_shim_helper_rs::scmutex::SelfContainedMutex<i32>>> ()
#16 0x00005555555aa085 in core::ops::function::FnOnce::call_once{{vtable.shim}} ()
#17 0x00005555555f87fd in generator::stack::StackBox$LT$F$GT$::call_once::h635448abd57c58fa ()
#18 0x000055555567ce71 in generator::gen_impl::gen_init ()
#19 0x0000000000000000 in ?? ()

It looks like loom::sync::Arc's drop implmentation is panicking...?

Here's a noisier / more-informative backtrace from a non-release build:

#0  0x00007ffff7d01a7c in pthread_kill () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff7cad476 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007ffff7c937f3 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x000055555587e427 in std::sys::unix::abort_internal () at library/std/src/sys/unix/mod.rs:317
#4  0x0000555555878a21 in std::panicking::rust_panic_with_hook () at library/std/src/panicking.rs:713
#5  0x00005555558786c1 in std::panicking::begin_panic_handler::{closure#0} () at library/std/src/panicking.rs:586
#6  0x00005555558768dc in std::sys_common::backtrace::__rust_end_short_backtrace<std::panicking::begin_panic_handler::{closure_env#0}, !> () at library/std/src/sys_common/backtrace.rs:138
#7  0x0000555555878422 in std::panicking::begin_panic_handler () at library/std/src/panicking.rs:584
#8  0x00005555555c31b3 in core::panicking::panic_fmt () at library/core/src/panicking.rs:142
#9  0x00005555555c307d in core::panicking::panic () at library/core/src/panicking.rs:48
#10 0x000055555581be1d in core::option::Option<usize>::unwrap<usize> (self=...) at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/option.rs:775
#11 0x00005555556767bd in loom::rt::thread::Set::active_mut (self=0x7ffff7a61880) at src/rt/thread.rs:259
#12 0x00005555556615c8 in loom::rt::object::Ref<loom::rt::arc::State>::set_action<loom::rt::arc::State> (self=..., execution=0x7ffff7a61850, action=..., location=...) at src/rt/object.rs:387
#13 0x0000555555660a8e in loom::rt::object::{impl#8}::branch_action::{closure#0}<loom::rt::arc::State, loom::rt::arc::Action> (execution=0x7ffff7a61850) at src/rt/object.rs:354
#14 0x000055555564e667 in loom::rt::branch::{closure#0}<loom::rt::object::{impl#8}::branch_action::{closure_env#0}<loom::rt::arc::State, loom::rt::arc::Action>, ()> (execution=0x7ffff7a61850)
    at src/rt/mod.rs:119
#15 0x00005555556477ca in loom::rt::scheduler::{impl#0}::with_execution::{closure#0}<loom::rt::branch::{closure_env#0}<loom::rt::object::{impl#8}::branch_action::{closure_env#0}<loom::rt::arc::State, loom::rt::arc::Action>, ()>, ((), bool)> (state=0x7ffff7a611a0) at src/rt/scheduler.rs:48
#16 0x000055555564abed in loom::rt::scheduler::{impl#0}::with_state::{closure#0}<loom::rt::scheduler::{impl#0}::with_execution::{closure_env#0}<loom::rt::branch::{closure_env#0}<loom::rt::object::{impl#8}::branch_action::{closure_env#0}<loom::rt::arc::State, loom::rt::arc::Action>, ()>, ((), bool)>, ((), bool)> (state=0x7ffff7a61198) at src/rt/scheduler.rs:130
#17 0x0000555555657054 in scoped_tls::ScopedKey<core::cell::RefCell<loom::rt::scheduler::State>>::with<core::cell::RefCell<loom::rt::scheduler::State>, loom::rt::scheduler::{impl#0}::with_state::{closure_env#0}<loom::rt::scheduler::{impl#0}::with_execution::{closure_env#0}<loom::rt::branch::{closure_env#0}<loom::rt::object::{impl#8}::branch_action::{closure_env#0}<loom::rt::arc::State, loom::rt::arc::Action>, ()>, ((), bool)>, ((), bool)>, ((), bool)> (self=0x55555599ed28 <loom::rt::scheduler::STATE>, f=...) at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/scoped-tls-1.0.1/src/lib.rs:171
#18 0x0000555555649bb0 in loom::rt::scheduler::Scheduler::with_state<loom::rt::scheduler::{impl#0}::with_execution::{closure_env#0}<loom::rt::branch::{closure_env#0}<loom::rt::object::{impl#8}::branch_action::{closure_env#0}<loom::rt::arc::State, loom::rt::arc::Action>, ()>, ((), bool)>, ((), bool)> (f=...) at src/rt/scheduler.rs:130
#19 0x00005555556471d9 in loom::rt::scheduler::Scheduler::with_execution<loom::rt::branch::{closure_env#0}<loom::rt::object::{impl#8}::branch_action::{closure_env#0}<loom::rt::arc::State, loom::rt::arc::Action>, ()>, ((), bool)> (f=...) at src/rt/scheduler.rs:48
#20 0x0000555555653769 in loom::rt::execution<loom::rt::branch::{closure_env#0}<loom::rt::object::{impl#8}::branch_action::{closure_env#0}<loom::rt::arc::State, loom::rt::arc::Action>, ()>, ((), bool)> (
    f=...) at src/rt/mod.rs:171
#21 0x000055555564d769 in loom::rt::branch<loom::rt::object::{impl#8}::branch_action::{closure_env#0}<loom::rt::arc::State, loom::rt::arc::Action>, ()> (f=...) at src/rt/mod.rs:118
#22 0x000055555565fa98 in loom::rt::object::Ref<loom::rt::arc::State>::branch_action<loom::rt::arc::State, loom::rt::arc::Action> (self=..., action=loom::rt::arc::Action::RefDec, location=...)
    at src/rt/object.rs:351
#23 0x0000555555645180 in loom::rt::arc::Arc::branch (self=0x7fffe800f980, action=loom::rt::arc::Action::RefDec, location=...) at src/rt/arc.rs:160
#24 0x0000555555644887 in loom::rt::arc::Arc::ref_dec (self=0x7fffe800f980, location=...) at src/rt/arc.rs:113
#25 0x00005555555da79e in loom::sync::arc::{impl#4}::drop<shadow_shim_helper_rs::scmutex::SelfContainedMutex<i32>> (self=0x7ffff7fb17b8)
    at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/sync/arc.rs:246
#26 0x00005555555d94aa in core::ptr::drop_in_place<loom::sync::arc::Arc<shadow_shim_helper_rs::scmutex::SelfContainedMutex<i32>>> ()
    at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/ptr/mod.rs:487
#27 0x00005555555e04a0 in loom_tests::test_threads::{closure#0} () at lib/shadow-shim-helper-rs/tests/loom_tests.rs:39
#28 0x00005555555ca7f4 in loom::model::{impl#0}::check::{closure#0}<loom_tests::test_threads::{closure_env#0}> ()
    at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/model.rs:185
#29 0x00005555555d9301 in core::ops::function::FnOnce::call_once<loom::model::{impl#0}::check::{closure_env#0}<loom_tests::test_threads::{closure_env#0}>, ()> ()
    at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/ops/function.rs:248
#30 0x000055555563067d in alloc::boxed::{impl#44}::call_once<(), dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global> (self=..., args=())
    at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/alloc/src/boxed.rs:1940
#31 0x000055555564d468 in loom::rt::scheduler::spawn_threads::{closure#0}::{closure#0} () at src/rt/scheduler.rs:149
#32 0x0000555555642e7d in generator::gen_impl::{impl#9}::init_code::{closure#0}<core::option::Option<alloc::boxed::Box<dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global>>, (), loom::rt::scheduler::spawn_threads::{closure#0}::{closure_env#0}> () at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/generator-0.7.1/src/gen_impl.rs:344
#33 0x000055555563646e in generator::stack::StackBox<generator::gen_impl::{impl#9}::init_code::{closure_env#0}<core::option::Option<alloc::boxed::Box<dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global>>, (), loom::rt::scheduler::spawn_threads::{closure#0}::{closure_env#0}>>::call_once<generator::gen_impl::{impl#9}::init_code::{closure_env#0}<core::option::Option<alloc::boxed::Box<dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global>>, (), loom::rt::scheduler::spawn_threads::{closure#0}::{closure_env#0}>> (data=0x7ffff7fb1e80)
    at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/generator-0.7.1/src/stack/mod.rs:139
#34 0x0000555555834fcc in generator::stack::Func::call_once (self=...) at src/stack/mod.rs:121
#35 0x0000555555834c33 in generator::gen_impl::gen_init::{closure#0} () at src/gen_impl.rs:560
#36 0x0000555555834423 in core::ops::function::FnOnce::call_once<generator::gen_impl::gen_init::{closure_env#0}, ()> ()
    at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/ops/function.rs:248
#37 0x0000555555838bd1 in std::panicking::try::do_call<generator::gen_impl::gen_init::{closure_env#0}, ()> (data=0x7ffff7fb1d58)
    at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:492
#38 0x0000555555838c7b in __rust_try ()
#39 0x0000555555838b2f in std::panicking::try<(), generator::gen_impl::gen_init::{closure_env#0}> (f=...) at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:456
#40 0x00005555558385ff in std::panic::catch_unwind<generator::gen_impl::gen_init::{closure_env#0}, ()> (f=...) at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panic.rs:137
--Type <RET> for more, q to quit, c to continue without paging--
#41 0x000055555583498d in generator::gen_impl::catch_unwind_filter<generator::gen_impl::gen_init::{closure_env#0}, ()> (f=...) at src/gen_impl.rs:551
#42 0x0000555555834b47 in generator::gen_impl::gen_init (f=0x7ffff7fb1fa0) at src/gen_impl.rs:578
#43 0x0000000000000000 in ?? ()
sporksmith commented 1 year ago

Dug through the backtrace a bit. It looks like one of the panics originates in the unwrap in #11 0x00005555556767bd in loom::rt::thread::Set::active_mut (self=0x7ffff7a61880) at src/rt/thread.rs:259:

impl Set {
...
    pub(crate) fn active_mut(&mut self) -> &mut Thread {
        &mut self.threads[self.active.unwrap()]
    }

I'm not sure where the second panic is coming from, or whether that one is the first or second panic. I tried setting a breakpoint on panic in gdb, but still only see this panic.

sporksmith commented 1 year ago

Ok, I think I've boiled it down to a much smaller repro:

Starting with a fresh cargo project:

$ cargo init
$ cargo add loom

Then in tests/loom-test.rs:

use loom::{
    sync::{
        Arc,
        Notify,
    },
    thread,
};

#[test]
fn repro() {
    loom::model(|| {
        let notify= Arc::new(Notify::new());

        let waiter = {
            let notify = notify.clone();
            thread::spawn(move || {
                notify.wait();
            })
        };

        waiter.join().unwrap();
    });
}

and repro:

$ RUSTFLAGS="--cfg loom" cargo test --test loom-test
    Finished test [unoptimized + debuginfo] target(s) in 0.02s
     Running tests/loom-test.rs (target/debug/deps/loom_test-ed2e0ccf4660f873)

running 1 test
thread panicked while panicking. aborting.
error: test failed, to rerun pass `--test loom-test`

Caused by:
  process didn't exit successfully: `/home/jnewsome/tmp/loomtest/target/debug/deps/loom_test-ed2e0ccf4660f873` (signal: 6, SIGABRT: process abort signal)

And via gdb, we have a similar backtrace as before:

#0  0x00007ffff7d01a7c in pthread_kill () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff7cad476 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007ffff7c937f3 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x0000555555841637 in std::sys::unix::abort_internal () at library/std/src/sys/unix/mod.rs:347
#4  0x000055555583bd2c in std::panicking::rust_panic_with_hook () at library/std/src/panicking.rs:703
#5  0x000055555583b9d1 in std::panicking::begin_panic_handler::{closure#0} () at library/std/src/panicking.rs:577
#6  0x0000555555839dec in std::sys_common::backtrace::__rust_end_short_backtrace<std::panicking::begin_panic_handler::{closure_env#0}, !> () at library/std/src/sys_common/backtrace.rs:137
#7  0x000055555583b732 in std::panicking::begin_panic_handler () at library/std/src/panicking.rs:575
#8  0x00005555555be803 in core::panicking::panic_fmt () at library/core/src/panicking.rs:65
#9  0x00005555555be8dd in core::panicking::panic () at library/core/src/panicking.rs:115
#10 0x00005555557a06a5 in core::option::Option<usize>::unwrap<usize> (self=...) at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/core/src/option.rs:778
#11 0x0000555555646b2d in loom::rt::thread::Set::active_mut (self=0x7ffff7c62840) at src/rt/thread.rs:259
#12 0x000055555563ca18 in loom::rt::object::Ref<loom::rt::arc::State>::set_action<loom::rt::arc::State> (self=..., execution=0x7ffff7c62810, action=..., location=...) at src/rt/object.rs:387
#13 0x000055555563c69b in loom::rt::object::{impl#8}::branch_action::{closure#0}<loom::rt::arc::State, loom::rt::arc::Action> (execution=0x7ffff7c62810) at src/rt/object.rs:354
#14 0x000055555562baa7 in loom::rt::branch::{closure#0}<loom::rt::object::{impl#8}::branch_action::{closure_env#0}<loom::rt::arc::State, loom::rt::arc::Action>, ()> (execution=0x7ffff7c62810) at src/rt/mod.rs:119
#15 0x00005555556281ba in loom::rt::scheduler::{impl#0}::with_execution::{closure#0}<loom::rt::branch::{closure_env#0}<loom::rt::object::{impl#8}::branch_action::{closure_env#0}<loom::rt::arc::State, loom::rt::arc::Action>, ()>, ((), bool)> (state=0x7ffff7c62160)
    at src/rt/scheduler.rs:48
#16 0x0000555555629cdd in loom::rt::scheduler::{impl#0}::with_state::{closure#0}<loom::rt::scheduler::{impl#0}::with_execution::{closure_env#0}<loom::rt::branch::{closure_env#0}<loom::rt::object::{impl#8}::branch_action::{closure_env#0}<loom::rt::arc::State, loom::rt::arc::Action>, ()>, ((), bool)>, ((), bool)> (state=0x7ffff7c62158) at src/rt/scheduler.rs:130
#17 0x000055555562df54 in scoped_tls::ScopedKey<core::cell::RefCell<loom::rt::scheduler::State>>::with<core::cell::RefCell<loom::rt::scheduler::State>, loom::rt::scheduler::{impl#0}::with_state::{closure_env#0}<loom::rt::scheduler::{impl#0}::with_execution::{closure_env#0}<loom::rt::branch::{closure_env#0}<loom::rt::object::{impl#8}::branch_action::{closure_env#0}<loom::rt::arc::State, loom::rt::arc::Action>, ()>, ((), bool)>, ((), bool)>, ((), bool)> (self=0x55555594e740 <loom::rt::scheduler::STATE>, f=...)
    at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/scoped-tls-1.0.1/src/lib.rs:171
#18 0x0000555555628eb0 in loom::rt::scheduler::Scheduler::with_state<loom::rt::scheduler::{impl#0}::with_execution::{closure_env#0}<loom::rt::branch::{closure_env#0}<loom::rt::object::{impl#8}::branch_action::{closure_env#0}<loom::rt::arc::State, loom::rt::arc::Action>, ()>, ((), bool)>, ((), bool)> (f=...) at src/rt/scheduler.rs:130
#19 0x0000555555627f39 in loom::rt::scheduler::Scheduler::with_execution<loom::rt::branch::{closure_env#0}<loom::rt::object::{impl#8}::branch_action::{closure_env#0}<loom::rt::arc::State, loom::rt::arc::Action>, ()>, ((), bool)> (f=...) at src/rt/scheduler.rs:48
#20 0x000055555562c959 in loom::rt::execution<loom::rt::branch::{closure_env#0}<loom::rt::object::{impl#8}::branch_action::{closure_env#0}<loom::rt::arc::State, loom::rt::arc::Action>, ()>, ((), bool)> (f=...) at src/rt/mod.rs:171
#21 0x000055555562ac19 in loom::rt::branch<loom::rt::object::{impl#8}::branch_action::{closure_env#0}<loom::rt::arc::State, loom::rt::arc::Action>, ()> (f=...) at src/rt/mod.rs:118
#22 0x000055555563b6c8 in loom::rt::object::Ref<loom::rt::arc::State>::branch_action<loom::rt::arc::State, loom::rt::arc::Action> (self=..., action=loom::rt::arc::Action::RefDec, location=...) at src/rt/object.rs:351
#23 0x0000555555626a20 in loom::rt::arc::Arc::branch (self=0x7ffff000f9e0, action=loom::rt::arc::Action::RefDec, location=...) at src/rt/arc.rs:160
#24 0x0000555555626117 in loom::rt::arc::Arc::ref_dec (self=0x7ffff000f9e0, location=...) at src/rt/arc.rs:113
#25 0x00005555555d32ad in loom::sync::arc::{impl#4}::drop<loom::sync::notify::Notify> (self=0x7ffff7fab908) at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/sync/arc.rs:246
#26 0x00005555555d2c3a in core::ptr::drop_in_place<loom::sync::arc::Arc<loom::sync::notify::Notify>> () at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/core/src/ptr/mod.rs:490
#27 0x00005555555d2dca in core::ptr::drop_in_place<loom_test::repro::{closure#0}::{closure_env#0}> () at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/core/src/ptr/mod.rs:490
#28 0x00005555555d3848 in loom_test::repro::{closure#0}::{closure#0} () at tests/loom-test.rs:18
#29 0x00005555555d3c94 in loom::thread::spawn_internal::{closure#0}<loom_test::repro::{closure#0}::{closure_env#0}, ()> () at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/thread.rs:161
#30 0x00005555555c3550 in loom::rt::spawn::{closure#1}<loom::thread::spawn_internal::{closure_env#0}<loom_test::repro::{closure#0}::{closure_env#0}, ()>> () at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/mod.rs:76
#31 0x00005555555d213e in core::ops::function::FnOnce::call_once<loom::rt::spawn::{closure_env#1}<loom::thread::spawn_internal::{closure_env#0}<loom_test::repro::{closure#0}::{closure_env#0}, ()>>, ()> ()
    at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/core/src/ops/function.rs:251
#32 0x000055555563faed in alloc::boxed::{impl#45}::call_once<(), dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global> (self=..., args=()) at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/alloc/src/boxed.rs:1987
#33 0x000055555562a988 in loom::rt::scheduler::spawn_threads::{closure#0}::{closure#0} () at src/rt/scheduler.rs:149
#34 0x000055555562470d in generator::gen_impl::{impl#9}::init_code::{closure#0}<core::option::Option<alloc::boxed::Box<dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global>>, (), loom::rt::scheduler::spawn_threads::{closure#0}::{closure_env#0}> ()
    at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/generator-0.7.1/src/gen_impl.rs:344
#35 0x000055555561a4ce in generator::stack::StackBox<generator::gen_impl::{impl#9}::init_code::{closure_env#0}<core::option::Option<alloc::boxed::Box<dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global>>, (), loom::rt::scheduler::spawn_threads::{closure#0}::{closure_env#0}>>::call_once<generator::gen_impl::{impl#9}::init_code::{closure_env#0}<core::option::Option<alloc::boxed::Box<dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global>>, (), loom::rt::scheduler::spawn_threads::{closure#0}::{closure_env#0}>> (
    data=0x7ffff7fabe80) at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/generator-0.7.1/src/stack/mod.rs:139
#36 0x00005555557f96bc in generator::stack::Func::call_once (self=...) at src/stack/mod.rs:121
#37 0x00005555557fb4a3 in generator::gen_impl::gen_init::{closure#0} () at src/gen_impl.rs:560
#38 0x00005555557fa943 in core::ops::function::FnOnce::call_once<generator::gen_impl::gen_init::{closure_env#0}, ()> () at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/core/src/ops/function.rs:251
#39 0x00005555557fa780 in std::panicking::try::do_call<generator::gen_impl::gen_init::{closure_env#0}, ()> (data=0x7ffff7fabd58) at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/panicking.rs:483
#40 0x00005555557fae0b in __rust_try ()
#41 0x00005555557fa6df in std::panicking::try<(), generator::gen_impl::gen_init::{closure_env#0}> (f=...) at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/panicking.rs:447
#42 0x00005555557fd28f in std::panic::catch_unwind<generator::gen_impl::gen_init::{closure_env#0}, ()> (f=...) at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/panic.rs:137
#43 0x00005555557fb1fd in generator::gen_impl::catch_unwind_filter<generator::gen_impl::gen_init::{closure_env#0}, ()> (f=...) at src/gen_impl.rs:551
#44 0x00005555557fb3b7 in generator::gen_impl::gen_init (f=0x7ffff7fabfa0) at src/gen_impl.rs:578
#45 0x0000000000000000 in ?? ()

Obviously there's a bug in the test code: the waiting thread is never woken up. Presumably there's some analogous bug in my original larger code, but I can't debug it since loom aborts before generating a checkpoint or other useful output.

sporksmith commented 1 year ago

If I get rid of the Arc, and instead just create the Notify object inside the thread, loom correctly reports the deadlock rather than double-panicking. So it seems to be some bad interaction between Arc's Drop implementation and deadlock...

use loom::{
    sync::{
        Notify,
    },
    thread,
};

#[test]
fn repro() {
    loom::model(|| {
        let waiter = {
            thread::spawn(move || {
                let notify = Notify::new();
                notify.wait();
            })
        };

        waiter.join().unwrap();
    });
}
$ RUSTFLAGS="--cfg loom" cargo test --test loom-test
   Compiling loomtest v0.1.0 (/home/jnewsome/tmp/loomtest)
    Finished test [unoptimized + debuginfo] target(s) in 0.87s
     Running tests/loom-test.rs (target/debug/deps/loom_test-ed2e0ccf4660f873)

running 1 test
test repro ... FAILED

failures:

---- repro stdout ----
thread 'repro' panicked at 'deadlock; threads = [(Id(0), Blocked(Location(None))), (Id(1), Blocked(Location(None)))]', /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/execution.rs:215:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

failures:
    repro

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

error: test failed, to rerun pass `--test loom-test`
Darksonn commented 1 year ago

What happens if you pass --nocapture to the test that double panics?

RUSTFLAGS="--cfg loom" cargo test --test loom-test -- --nocapture
sporksmith commented 1 year ago

Ah thanks - that gives me the first panic message and a backtrace.

The first panic message tells me there's a deadlock, but seems to be missing the thread Locations... presumably an orthogonal issue, though if you know how to improve it I'd greatly appreciate it to help debug my real deadlock in my original code. EDIT - nevermind; LOOM_LOCATION and LOOM_LOG WAI, though LOOM_CHECKPOINT_FILE doesn't, presumably because of the 2nd panic before writing it out.

thread 'repro' panicked at 'deadlock; threads = [(Id(0), Blocked(Location(None))), (Id(1), Blocked(Location(None)))]'

The backtrace looks very similar to what I got via gdb. Here's the full output:

$ RUSTFLAGS="--cfg loom" cargo test --test loom-test -- --nocapture
   Compiling loomtest v0.1.0 (/home/jnewsome/tmp/loomtest)
    Finished test [unoptimized + debuginfo] target(s) in 1.99s
     Running tests/loom-test.rs (target/debug/deps/loom_test-ed2e0ccf4660f873)

running 1 test
thread 'repro' panicked at 'deadlock; threads = [(Id(0), Blocked(Location(None))), (Id(1), Blocked(Location(None)))]', /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/execution.rs:215:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'repro' panicked at 'called `Option::unwrap()` on a `None` value', /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/thread.rs:259:39
stack backtrace:
   0:     0x5618e248a890 - std::backtrace_rs::backtrace::libunwind::trace::h10a3454512d6724c
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:     0x5618e248a890 - std::backtrace_rs::backtrace::trace_unsynchronized::h60c77ea7cb0df229
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x5618e248a890 - std::sys_common::backtrace::_print_fmt::h01643809be5392e9
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/sys_common/backtrace.rs:65:5
   3:     0x5618e248a890 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hc6486d7b214e20ee
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/sys_common/backtrace.rs:44:22
   4:     0x5618e24afe1e - core::fmt::write::h396ced1895c2ef11
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/core/src/fmt/mod.rs:1209:17
   5:     0x5618e2487535 - std::io::Write::write_fmt::haaecb11edc216834
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/io/mod.rs:1682:15
   6:     0x5618e248a655 - std::sys_common::backtrace::_print::h3eebaa973f8e3926
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/sys_common/backtrace.rs:47:5
   7:     0x5618e248a655 - std::sys_common::backtrace::print::h12da160db56dea6e
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/sys_common/backtrace.rs:34:9
   8:     0x5618e248c4cf - std::panicking::default_hook::{{closure}}::h271b878f2b03712b
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/panicking.rs:267:22
   9:     0x5618e248c20a - std::panicking::default_hook::he0bd4fb3722ee06f
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/panicking.rs:286:9
  10:     0x5618e244e540 - <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call::h8accfae86eb02ed1
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/alloc/src/boxed.rs:2001:9
  11:     0x5618e244c2cd - generator::gen_impl::catch_unwind_filter::{{closure}}::{{closure}}::h4676550068910a16
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/generator-0.7.1/src/gen_impl.rs:547:13
  12:     0x5618e248cbed - <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call::haf66e02c660cb96a
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/alloc/src/boxed.rs:2001:9
  13:     0x5618e248cbed - std::panicking::rust_panic_with_hook::h0e2a2b7a9776d76b
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/panicking.rs:692:13
  14:     0x5618e248c921 - std::panicking::begin_panic_handler::{{closure}}::ha33eff9b02266b02
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/panicking.rs:577:13
  15:     0x5618e248ad3c - std::sys_common::backtrace::__rust_end_short_backtrace::h35f88c64c7aceda7
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/sys_common/backtrace.rs:137:18
  16:     0x5618e248c682 - rust_begin_unwind
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/panicking.rs:575:5
  17:     0x5618e220f803 - core::panicking::panic_fmt::h940cb25cf018faef
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/core/src/panicking.rs:65:14
  18:     0x5618e220f8dd - core::panicking::panic::hf0876db436e55c60
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/core/src/panicking.rs:115:5
  19:     0x5618e23f15f5 - core::option::Option<T>::unwrap::h2fd5c605f242a393
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/core/src/option.rs:778:21
  20:     0x5618e2297a7d - loom::rt::thread::Set::active_mut::hc43604150b1cf369
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/thread.rs:259:27
  21:     0x5618e228d968 - loom::rt::object::Ref<T>::set_action::hd6c685ee3a5a70a6
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/object.rs:387:9
  22:     0x5618e228d5eb - loom::rt::object::Ref<T>::branch_action::{{closure}}::h24c22c75b3d98d80
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/object.rs:354:13
  23:     0x5618e227c9f7 - loom::rt::branch::{{closure}}::hbe744134b76a11b7
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/mod.rs:119:19
  24:     0x5618e227910a - loom::rt::scheduler::Scheduler::with_execution::{{closure}}::h025c7a60dd40af06
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/scheduler.rs:48:34
  25:     0x5618e227ac2d - loom::rt::scheduler::Scheduler::with_state::{{closure}}::h89d4367b0d02241c
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/scheduler.rs:130:28
  26:     0x5618e227eea4 - scoped_tls::ScopedKey<T>::with::h2f05b1f81e3f5f12
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/scoped-tls-1.0.1/src/lib.rs:171:13
  27:     0x5618e2279e00 - loom::rt::scheduler::Scheduler::with_state::h8817cf8bdd269312
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/scheduler.rs:130:9
  28:     0x5618e2278e89 - loom::rt::scheduler::Scheduler::with_execution::h474f1f5163eadf92
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/scheduler.rs:48:9
  29:     0x5618e227d8a9 - loom::rt::execution::hacc9b7ff62759614
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/mod.rs:171:5
  30:     0x5618e227bb69 - loom::rt::branch::h4a968391b66609c5
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/mod.rs:118:25
  31:     0x5618e228c618 - loom::rt::object::Ref<T>::branch_action::hac77a0944dbe9791
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/object.rs:351:9
  32:     0x5618e2277970 - loom::rt::arc::Arc::branch::hb11fc81fd9d505e1
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/arc.rs:160:9
  33:     0x5618e2277067 - loom::rt::arc::Arc::ref_dec::h869ba424691b71d7
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/arc.rs:113:9
  34:     0x5618e222424d - <loom::sync::arc::Arc<T> as core::ops::drop::Drop>::drop::h4964db31654a1abf
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/sync/arc.rs:246:12
  35:     0x5618e2223ada - core::ptr::drop_in_place<loom::sync::arc::Arc<()>>::ha808b3050ece2afb
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/core/src/ptr/mod.rs:490:1
  36:     0x5618e22247e2 - loom_test::repro::{{closure}}::{{closure}}::h38e4232a6df14b36
                               at /home/jnewsome/tmp/loomtest/tests/loom-test.rs:19:13
  37:     0x5618e2224c34 - loom::thread::spawn_internal::{{closure}}::h931e9954ff676618
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/thread.rs:161:47
  38:     0x5618e2214520 - loom::rt::spawn::{{closure}}::hd0988ff805d2a117
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/mod.rs:76:9
  39:     0x5618e22230be - core::ops::function::FnOnce::call_once{{vtable.shim}}::h5136e5a97bc1d594
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/core/src/ops/function.rs:251:5
  40:     0x5618e2290a3d - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h3d2a2efe50cf74d8
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/alloc/src/boxed.rs:1987:9
  41:     0x5618e227b8d8 - loom::rt::scheduler::spawn_threads::{{closure}}::{{closure}}::hcc72cf4838aec22f
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.5.6/src/rt/scheduler.rs:149:21
  42:     0x5618e227565d - generator::gen_impl::GeneratorImpl<A,T>::init_code::{{closure}}::hf97b73f155de99ef
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/generator-0.7.1/src/gen_impl.rs:344:21
  43:     0x5618e226b41e - generator::stack::StackBox<F>::call_once::h835209945d4670a5
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/generator-0.7.1/src/stack/mod.rs:139:13
  44:     0x5618e244a60c - generator::stack::Func::call_once::h89029c258cb93f95
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/generator-0.7.1/src/stack/mod.rs:121:9
  45:     0x5618e244c3f3 - generator::gen_impl::gen_init::{{closure}}::h43b98fd9d4661966
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/generator-0.7.1/src/gen_impl.rs:560:9
  46:     0x5618e244b893 - core::ops::function::FnOnce::call_once::h27bf596fb84ae7de
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/core/src/ops/function.rs:251:5
  47:     0x5618e244b6d0 - std::panicking::try::do_call::h402c64b10d78a663
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/panicking.rs:483:40
  48:     0x5618e244bd5b - __rust_try
  49:     0x5618e244b62f - std::panicking::try::hda47dd4157e710a2
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/panicking.rs:447:19
  50:     0x5618e244e1df - std::panic::catch_unwind::h052809320a3b1ae2
                               at /rustc/edf0182213a9e30982eb34f3925ddc4cf5ed3471/library/std/src/panic.rs:137:14
  51:     0x5618e244c14d - generator::gen_impl::catch_unwind_filter::h6808b038cf23cfd3
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/generator-0.7.1/src/gen_impl.rs:551:5
  52:     0x5618e244c307 - generator::gen_impl::gen_init::h5fcf793a5e0ad37a
                               at /home/jnewsome/.cargo/registry/src/github.com-1ecc6299db9ec823/generator-0.7.1/src/gen_impl.rs:578:25
  53:                0x0 - <unknown>
thread panicked while panicking. aborting.
error: test failed, to rerun pass `--test loom-test`

Caused by:
  process didn't exit successfully: `/home/jnewsome/tmp/loomtest/target/debug/deps/loom_test-ed2e0ccf4660f873 --nocapture` (signal: 6, SIGABRT: process abort signal)
Pointerbender commented 1 year ago

@sporksmith Could you please try if the code from PR #297 fixes the issue you are describing?