TraceMachina / nativelink

NativeLink is an open-source, high-performance build cache and remote execution server, compatible with Bazel, Buck2, Reclient, and other RBE-compatible build systems. It offers drastically faster builds, reduced test flakiness, and support for specialized hardware.
https://nativelink.com
Apache License 2.0

Panic during `running_actions_manager_tests::kill_all_waits_for_all_tasks_to_finish` #735

Open aaronmondal opened 4 months ago

aaronmondal commented 4 months ago

Reproducible by uncommenting

https://github.com/TraceMachina/nativelink/blob/35daf433f01150cdf3b5da4e9a97e561be03cbdf/flake.nix#L142-L149

and running `nix flake check -L`.

nativelink-nextest>      SIGABRT [   0.093s] nativelink-worker::running_actions_manager_test running_actions_manager_tests::kill_all_waits_for_all_tasks_to_finish
nativelink-nextest> --- STDOUT:              nativelink-worker::running_actions_manager_test running_actions_manager_tests::kill_all_waits_for_all_tasks_to_finish ---
nativelink-nextest> running 1 test
nativelink-nextest> 
nativelink-nextest> --- STDERR:              nativelink-worker::running_actions_manager_test running_actions_manager_tests::kill_all_waits_for_all_tasks_to_finish ---
nativelink-nextest> thread 'running_actions_manager_tests::kill_all_waits_for_all_tasks_to_finish' panicked at nativelink-worker/tests/running_actions_manager_test.rs:2534:13:
nativelink-nextest> assertion failed: `(left == right)`
nativelink-nextest> Diff < left / right > :
nativelink-nextest> <true
nativelink-nextest> >false
nativelink-nextest> note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
nativelink-nextest> thread 'running_actions_manager_tests::kill_all_waits_for_all_tasks_to_finish' panicked at nativelink-worker/src/running_actions_manager.rs:1130:9:
nativelink-nextest> RunningActionImpl did not cleanup. This is a violation of how RunningActionImpl's requirements
nativelink-nextest> stack backtrace:
nativelink-nextest>    0:     0x7ffff7e58eb3 - std::backtrace_rs::backtrace::libunwind::trace::h7d9d6777fb8bb20d
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/../../backtrace/src/backtrace/libunwind.rs:104:5
nativelink-nextest>    1:     0x7ffff7e58eb3 - std::backtrace_rs::backtrace::trace_unsynchronized::h9c973399eff3243c
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
nativelink-nextest>    2:     0x7ffff7e58eb3 - std::sys_common::backtrace::_print_fmt::hcfcb0438b59f41ac
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:68:5
nativelink-nextest>    3:     0x7ffff7e58eb3 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h12ff509796e7db1f
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:44:22
nativelink-nextest>    4:     0x7ffff7b584f0 - core::fmt::rt::Argument::fmt::h0f89a5eb60e0da5a
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/fmt/rt.rs:142:9
nativelink-nextest>    5:     0x7ffff7b584f0 - core::fmt::write::hd2897b7ab92710d9
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/fmt/mod.rs:1120:17
nativelink-nextest>    6:     0x7ffff7e27172 - std::io::Write::write_fmt::h7aec362f631668c1
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/io/mod.rs:1810:15
nativelink-nextest>    7:     0x7ffff7e5c0de - std::sys_common::backtrace::_print::h02de5e3a7ec93331
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:47:5
nativelink-nextest>    8:     0x7ffff7e5c0de - std::sys_common::backtrace::print::h39dfbda16576800a
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:34:9
nativelink-nextest>    9:     0x7ffff7e5b860 - std::panicking::default_hook::{{closure}}::h9d3976e3cec30e20
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:272:22
nativelink-nextest>   10:     0x7ffff7e5b3d2 - std::panicking::default_hook::h8f0238c5452f23e8
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:292:9
nativelink-nextest>   11:     0x7ffff7e5c6b3 - std::panicking::rust_panic_with_hook::hc412fca92650d4b2
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:779:13
nativelink-nextest>   12:     0x7ffff7e5c3f8 - std::panicking::begin_panic_handler::{{closure}}::hf6e9952ead3e2c54
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:649:13
nativelink-nextest>   13:     0x7ffff7e5c386 - std::sys_common::backtrace::__rust_end_short_backtrace::h1410a9c85ac94b3e
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:171:18
nativelink-nextest>   14:     0x7ffff7e5c37f - rust_begin_unwind
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
nativelink-nextest>   15:     0x7ffff7ad33b4 - core::panicking::panic_fmt::h2838d0515af9b2ed
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
nativelink-nextest>   16:     0x7ffff7dabe3d - alloc::sync::Arc<T,A>::drop_slow::h29f2732f4b9ad7b1
nativelink-nextest>   17:     0x7ffff7b12432 - core::ptr::drop_in_place<futures_util::future::future::Then<futures_util::future::try_future::AndThen<futures_util::future::try_future::AndThen<futures_util::future::try_future::AndThen<core::pin::Pin<alloc::boxed::Box<dyn core::future::future::Future+Output = core::result::Result<alloc::sync::Arc<nativelink_worker::running_actions_manager::RunningActionImpl>,nativelink_error::Error>+core::marker::Send>>,core::pin::Pin<alloc::boxed::Box<dyn core::future::future::Future+Output = core::result::Result<alloc::sync::Arc<nativelink_worker::running_actions_manager::RunningActionImpl>,nativelink_error::Error>+core::marker::Send>>,<nativelink_worker::running_actions_manager::RunningActionImpl as nativelink_worker::running_actions_manager::RunningAction>::execute>,core::pin::Pin<alloc::boxed::Box<dyn core::future::future::Future+Output = core::result::Result<alloc::sync::Arc<nativelink_worker::running_actions_manager::RunningActionImpl>,nativelink_error::Error>+core::marker::Send>>,<nativelink_worker::running_actions_manager::RunningActionImpl as nativelink_worker::running_actions_manager::RunningAction>::upload_results>,core::pin::Pin<alloc::boxed::Box<dyn core::future::future::Future+Output = core::result::Result<nativelink_util::action_messages::ActionResult,nativelink_error::Error>+core::marker::Send>>,<nativelink_worker::running_actions_manager::RunningActionImpl as nativelink_worker::running_actions_manager::RunningAction>::get_finished_result>,running_actions_manager_test::running_actions_manager_tests::kill_all_waits_for_all_tasks_to_finish::{{closure}}::{{closure}}::{{closure}},running_actions_manager_test::running_actions_manager_tests::kill_all_waits_for_all_tasks_to_finish::{{closure}}::{{closure}}>>::h22380ae78abaf3fa
nativelink-nextest>   18:     0x7ffff7b51ef1 - running_actions_manager_test::running_actions_manager_tests::kill_all_waits_for_all_tasks_to_finish::{{closure}}::h139256e7aba877d7
nativelink-nextest>   19:     0x7ffff7dae032 - tokio::runtime::runtime::Runtime::block_on::hf8de6271c7d2a0bf
nativelink-nextest>   20:     0x7ffff7d94771 - core::ops::function::FnOnce::call_once::hc7dd41fdd9cc0784
nativelink-nextest>   21:     0x7ffff7e7be2f - core::ops::function::FnOnce::call_once::h05908bac8b786408
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/ops/function.rs:250:5
nativelink-nextest>   22:     0x7ffff7e7be2f - test::__rust_begin_short_backtrace::h1891cb0d38cf136c
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/test/src/lib.rs:627:18
nativelink-nextest>   23:     0x7ffff7e78d85 - test::types::RunnableTest::run::h7b114a4eb0b22b57
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/test/src/types.rs:146:40
nativelink-nextest>   24:     0x7ffff7e78d85 - test::run_test_in_process::{{closure}}::h8ace4e30da085cca
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/test/src/lib.rs:650:60
nativelink-nextest>   25:     0x7ffff7e78d85 - <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once::hcb5c1bdc57450707
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panic/unwind_safe.rs:272:9
nativelink-nextest>   26:     0x7ffff7e78d85 - std::panicking::try::do_call::h2d5735aed97e1b89
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:552:40
nativelink-nextest>   27:     0x7ffff7e78d85 - std::panicking::try::ha880a382083c2920
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:516:19
nativelink-nextest>   28:     0x7ffff7e78d85 - std::panic::catch_unwind::h784fa2f848fe6f40
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panic.rs:142:14
nativelink-nextest>   29:     0x7ffff7e78d85 - test::run_test_in_process::h103b2c3990253143
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/test/src/lib.rs:650:27
nativelink-nextest>   30:     0x7ffff7e78d85 - test::run_test::{{closure}}::h6a40236b92e11b77
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/test/src/lib.rs:573:43
nativelink-nextest>   31:     0x7ffff7e7c169 - test::run_test::{{closure}}::h17e776e993b8a8d0
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/test/src/lib.rs:601:41
nativelink-nextest>   32:     0x7ffff7e7c169 - std::sys_common::backtrace::__rust_begin_short_backtrace::hbda37206a6dd7cec
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:155:18
nativelink-nextest>   33:     0x7ffff7e7bfcb - std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}}::h1c9ed572ea0727fb
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/thread/mod.rs:529:17
nativelink-nextest>   34:     0x7ffff7e7bfcb - <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once::h3d4cbabe6b1ac0b0
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panic/unwind_safe.rs:272:9
nativelink-nextest>   35:     0x7ffff7e7bfcb - std::panicking::try::do_call::hd2546a679cd042c4
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:552:40
nativelink-nextest>   36:     0x7ffff7e7bfcb - std::panicking::try::h47d20f8f400cb55b
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:516:19
nativelink-nextest>   37:     0x7ffff7e7bfcb - std::panic::catch_unwind::h4dce4359d6dcde93
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panic.rs:142:14
nativelink-nextest>   38:     0x7ffff7e7bfcb - std::thread::Builder::spawn_unchecked_::{{closure}}::h0c12cc76028f764b
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/thread/mod.rs:528:30
nativelink-nextest>   39:     0x7ffff7e7bfcb - core::ops::function::FnOnce::call_once{{vtable.shim}}::h88e6859434dd2cc2
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/ops/function.rs:250:5
nativelink-nextest>   40:     0x7ffff7e5e5a5 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h0c19c267a854c212
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/alloc/src/boxed.rs:2015:9
nativelink-nextest>   41:     0x7ffff7e5e5a5 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hc50bf08128f952fb
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/alloc/src/boxed.rs:2015:9
nativelink-nextest>   42:     0x7ffff7e5e5a5 - std::sys::unix::thread::Thread::new::thread_start::ha18af1a028ebd29e
nativelink-nextest>                                at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys/unix/thread.rs:108:17
nativelink-nextest> thread 'running_actions_manager_tests::kill_all_waits_for_all_tasks_to_finish' panicked at library/core/src/panicking.rs:163:5:
nativelink-nextest> panic in a destructor during cleanup
nativelink-nextest> thread caused non-unwinding panic. aborting.

allada commented 4 months ago

We can probably fix this by spawning the future as its own task instead of just polling it in place.
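
To illustrate the idea, here is a minimal std-only sketch, not the actual NativeLink code. It assumes the abort happens because the failing assertion unwinds the test's stack, which drops a future still holding a `RunningActionImpl` that has not been cleaned up, so its `Drop` panics a second time. `Action` and `run_spawned` are hypothetical names, and `std::thread::spawn` stands in for a `tokio::spawn`-style task so the snippet needs no external crates.

```rust
use std::panic;
use std::thread;

/// Hypothetical stand-in for `RunningActionImpl`: dropping it before
/// `cleanup()` has run is treated as a bug and panics.
struct Action {
    cleaned_up: bool,
}

impl Action {
    fn cleanup(&mut self) {
        self.cleaned_up = true;
    }
}

impl Drop for Action {
    fn drop(&mut self) {
        assert!(self.cleaned_up, "Action did not clean up");
    }
}

/// Moves the action onto its own thread, then lets the caller fail.
/// Returns (caller_panicked, worker_finished_without_panicking).
fn run_spawned() -> (bool, bool) {
    let mut handle = None;
    let caller = panic::catch_unwind(panic::AssertUnwindSafe(|| {
        let mut action = Action { cleaned_up: false };
        // The worker owns `action`, so unwinding the caller never drops it.
        handle = Some(thread::spawn(move || {
            action.cleanup();
            // `action` is dropped here, after cleanup, on the worker thread.
        }));
        // Simulated failing test assertion.
        let finished_in_time = false;
        assert!(finished_in_time, "simulated test failure");
    }));
    // Joining succeeds only if the worker (including `Drop`) did not panic.
    let worker_ok = handle.map(|h| h.join().is_ok()).unwrap_or(false);
    (caller.is_err(), worker_ok)
}

fn main() {
    let (caller_panicked, worker_ok) = run_spawned();
    println!("caller panicked: {caller_panicked}, worker cleaned up: {worker_ok}");
}
```

If `action` instead lived on the caller's stack, the failed assertion would drop it mid-unwind and the second panic inside `Drop` would abort the process, which matches the `panic in a destructor during cleanup` message above. The simulated failure message is still printed to stderr when this runs.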