Open fl0rek opened 1 month ago
Thanks for the report!
Full Backtrace
This is the backtrace for the poisoned mutex panic, not the original underflow panic that caused it. You'll want to look just above in the logs.
let diff = max_stream_data - self.sent_max_stream_data; seems to be the offending substraction
That's surprising. self.sent_max_stream_data
is only ever populated with a max_stream_data
returned from an immediately preceding call to fn max_stream_data
. That value is a sum of self.assembler.bytes_read()
, which grows monotonically, and stream_receive_window
, which never changes within the lifetime of a connection. An accounting error in bytes_read
isn't impossible, but I'm not seeing one.
Hey @Ralith, thanks for the response!
At the time I wasn't capturing quinn logs to save space and all I have there is occasional
WARN quinn_proto::connection: discarding possible duplicate packet
.
I am trying to reproduce the panic again with more logging enabled, but so far it seems like a very rare occurance and I did not manage to yet.
Panic backtraces aren't Quinn logs; they're produced by the panic handler, which by default looks for RUST_BACKTRACE=1
in your environment per your paste. Be sure to set that.
Might be worth investigating hooking a smart fuzzer up to the stream bookkeeping directly, if you've got no idea at all what triggered this.
I caught an 'attempt to substract with overflow' panic here https://github.com/quinn-rs/quinn/blob/db2df614fcab3e9c17b3e5f325eb197920489779/quinn-proto/src/connection/streams/recv.rs#L109 while doing some ridiculous tests with 30 local endpoints communicating with each other in debug build.
Unfortunately there are too many logs to attach, but apparently a lot of packets are lost and something goes beyond the window.
Did you get a backtrace?
We've encountered a panic in
quinn-proto-0.10.4
. We're usingquinn-0.10.2
withlibp2p-0.53.1
.Full Backtrace
``` Finished dev [unoptimized + debuginfo] target(s) in 1.89s Running `target/debug/celestia node` thread 'main' panicked at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/quinn-proto-0.10.4/src/connection/streams/recv.rs:108:20: attempt to subtract with overflow note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace thread 'main' panicked at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/quinn-0.10.2/src/mutex.rs:138:42: called `Result::unwrap()` on an `Err` value: PoisonError { .. } stack backtrace: 0: 0x102468cb4 - std::backtrace_rs::backtrace::libunwind::trace::hdc7d2d82dacc83e6 at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/../../backtrace/src/backtrace/libunwind.rs:104:5 1: 0x102468cb4 - std::backtrace_rs::backtrace::trace_unsynchronized::h11fcbf27db856d40 at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5 2: 0x102468cb4 - std::sys_common::backtrace::_print_fmt::h117e5fb45bd92414 at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/sys_common/backtrace.rs:68:5 3: 0x102468cb4 -::run::he64183064a74982b at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/task/mod.rs:408:9 43: 0x100d0006c - tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}::{{closure}}::h4f54fd7046d30818 at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/scheduler/current_thread/mod.rs:706:25 44: 0x100cff5c4 - tokio::runtime::coop::with_budget::hd1df283faa20597e at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/coop.rs:107:5 45: 0x100cff5c4 - tokio::runtime::coop::budget::hdbc87cf7abae73a3 at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/coop.rs:73:5 46: 0x100cff5c4 - tokio::runtime::scheduler::current_thread::Context::run_task::{{closure}}::h4b04702e2166dc0f at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/scheduler/current_thread/mod.rs:343:43 47: 0x100cff2c8 - tokio::runtime::scheduler::current_thread::Context::enter::h5b01fcd5f4339216 at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/scheduler/current_thread/mod.rs:410:19 48: 0x100cff468 - tokio::runtime::scheduler::current_thread::Context::run_task::h5b8e74259132849e at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/scheduler/current_thread/mod.rs:343:23 49: 0x100cfffe0 - tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}::ha0c20fcef49684df at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/scheduler/current_thread/mod.rs:705:34 50: 0x100cff970 - tokio::runtime::scheduler::current_thread::CoreGuard::enter::{{closure}}::h9701170602fd9494 at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/scheduler/current_thread/mod.rs:743:68 51: 0x100c418ac - tokio::runtime::context::scoped::Scoped::set::h66e055fe3d61cb8f
at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/context/scoped.rs:40:9
52: 0x10106dd48 - tokio::runtime::context::set_scheduler::{{closure}}::h363ffc220f660c3b
at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/context.rs:176:26
53: 0x100b73b78 - std::thread::local::LocalKey::try_with::hb1ef866639d0dc22
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/thread/local.rs:286:16
54: 0x100b6eab8 - std::thread::local::LocalKey::with::ha8b2f4275ead3c72
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/thread/local.rs:262:9
55: 0x10106dd08 - tokio::runtime::context::set_scheduler::h92feef536709b8dd
at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/context.rs:176:9
56: 0x100cff798 - tokio::runtime::scheduler::current_thread::CoreGuard::enter::h6d792852750080b8
at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/scheduler/current_thread/mod.rs:743:27
57: 0x100cff9a0 - tokio::runtime::scheduler::current_thread::CoreGuard::block_on::hc5b2a62620625faa
at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/scheduler/current_thread/mod.rs:652:19
58: 0x100cf67d4 - tokio::runtime::scheduler::current_thread::CurrentThread::block_on::{{closure}}::h7b3eb93862647b6c
at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/scheduler/current_thread/mod.rs:175:28
59: 0x100c0e464 - tokio::runtime::context::runtime::enter_runtime::hb5570fc4746770ed
at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/context/runtime.rs:65:16
60: 0x100cf66f4 - tokio::runtime::scheduler::current_thread::CurrentThread::block_on::h3910abd9a4b6dd10
at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/scheduler/current_thread/mod.rs:167:9
61: 0x100c98d94 - tokio::runtime::runtime::Runtime::block_on::h476c80be26a5a58d
at /Users/mikolaj.florkiewicz/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/runtime.rs:348:47
62: 0x100f3a2a4 - celestia::main::h5794a3ede700ae30
at /Users/mikolaj.florkiewicz/dev/lumina-header_sync_repro/celestia/src/main.rs:4:5
63: 0x1010e50b4 - core::ops::function::FnOnce::call_once::hc4b6dc09ac572737
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/core/src/ops/function.rs:250:5
64: 0x1010720a4 - std::sys_common::backtrace::__rust_begin_short_backtrace::h93a345640907338a
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/sys_common/backtrace.rs:155:18
65: 0x100d7509c - std::rt::lang_start::{{closure}}::h6c6905c3eeb8cbbc
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/rt.rs:166:18
66: 0x102460254 - core::ops::function::impls:: for &F>::call_once::hc7fa490c25ff5c7e
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/core/src/ops/function.rs:284:13
67: 0x102460254 - std::panicking::try::do_call::h357310772e5ed201
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/panicking.rs:554:40
68: 0x102460254 - std::panicking::try::h7be07bf7667211c0
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/panicking.rs:518:19
69: 0x102460254 - std::panic::catch_unwind::h3ab45e592d6c9793
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/panic.rs:142:14
70: 0x102460254 - std::rt::lang_start_internal::{{closure}}::h522f1a8ded7e5ad3
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/rt.rs:148:48
71: 0x102460254 - std::panicking::try::do_call::h403386826d59f38d
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/panicking.rs:554:40
72: 0x102460254 - std::panicking::try::h5e255b7a06373cce
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/panicking.rs:518:19
73: 0x102460254 - std::panic::catch_unwind::h462489ab339aac24
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/panic.rs:142:14
74: 0x102460254 - std::rt::lang_start_internal::h7f00bef3cf1b3caa
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/rt.rs:148:20
75: 0x100d75068 - std::rt::lang_start::h7323fd224ebf8c76
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/rt.rs:165:17
76: 0x100f3a340 - _main
thread 'main' panicked at library/core/src/panicking.rs:163:5:
panic in a destructor during cleanup
thread caused non-unwinding panic. aborting.
[1] 96704 abort cargo run node
```
I don't think I can provide much details on reproduction. Code was running on the network for ~5h while I was trying to reproduce another bug when I saw this panic.
let diff = max_stream_data - self.sent_max_stream_data;
seems to be the offending substraction, but I don't know quinn well enought to tell whether that's a cause (to be fixed with e.g. saturating substraction) or a symptom of a deeper issue.