Open runsisi opened 3 months ago
here is another panic:
https://github.com/superfly/corrosion/blob/main/crates/corro-agent/src/agent/util.rs#L999
    for (_, changeset, _, _) in changesets.iter() {
        if let Some(ts) = changeset.ts() {
            let dur = (agent.clock().new_timestamp().get_time() - ts.0).to_duration(); // <--- panic here
            histogram!("corro.agent.changes.commit.lag.seconds").record(dur);
        }
    }
backtrace:
thread 'tokio-runtime-worker' panicked at /home/runsisi/build/corrosion/uhlc/src/ntp64.rs:164:14:
attempt to subtract with overflow
stack backtrace:
0: rust_begin_unwind
at /rustc/90e321d82a0a9c3d0e3f180d4d17541b729072e0/library/std/src/panicking.rs:645:5
1: core::panicking::panic_fmt
at /rustc/90e321d82a0a9c3d0e3f180d4d17541b729072e0/library/core/src/panicking.rs:72:14
2: core::panicking::panic
at /rustc/90e321d82a0a9c3d0e3f180d4d17541b729072e0/library/core/src/panicking.rs:142:5
3: <uhlc::ntp64::NTP64 as core::ops::arith::Sub>::sub
at /home/runsisi/build/corrosion/uhlc/src/ntp64.rs:164:14
4: <&uhlc::ntp64::NTP64 as core::ops::arith::Sub<uhlc::ntp64::NTP64>>::sub
at /home/runsisi/build/corrosion/uhlc/src/ntp64.rs:173:9
5: corro_agent::agent::util::process_multiple_changes::{closure#0}::{closure#0}::{closure#0}::{closure#0}
at /home/runsisi/build/corrosion/crates/corro-agent/src/agent/util.rs:999:27
6: tokio::runtime::context::runtime_mt::exit_runtime::<corro_agent::agent::util::process_multiple_changes::{closure#0}::{closure#0}::{closure#0}::{closure#0}, core::result::Result<alloc::vec::Vec<(corro_types::actor::ActorId, corro_types::broadcast::Changeset, corro_base_types::CrsqlDbVersion, corro_types::broadcast::ChangeSource)>, corro_types::agent::ChangeError>>
at /root/.cargo/registry/src/mirrors.ustc.edu.cn-12df342d903acd47/tokio-1.34.0/src/runtime/context/runtime_mt.rs:35:5
7: tokio::runtime::scheduler::multi_thread::worker::block_in_place::<corro_agent::agent::util::process_multiple_changes::{closure#0}::{closure#0}::{closure#0}::{closure#0}, core::result::Result<alloc::vec::Vec<(corro_types::actor::ActorId, corro_types::broadcast::Changeset, corro_base_types::CrsqlDbVersion, corro_types::broadcast::ChangeSource)>, corro_types::agent::ChangeError>>
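One possible way to avoid the overflow in the lag calculation is to use a checked subtraction and skip (or clamp to zero) when the remote timestamp is ahead of the local clock. The following is only a minimal sketch: `Ntp64` is a simplified stand-in for a 64-bit NTP-style timestamp (seconds in the upper 32 bits, fraction in the lower 32 bits), and `commit_lag` is a hypothetical helper. Neither is the actual uhlc or corrosion code.

```rust
use std::time::Duration;

/// Hypothetical stand-in for a 64-bit NTP-style timestamp:
/// upper 32 bits are whole seconds, lower 32 bits are the fraction of a second.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Ntp64(u64);

impl Ntp64 {
    fn from_secs(secs: u32) -> Self {
        Ntp64((secs as u64) << 32)
    }

    fn to_duration(self) -> Duration {
        let secs = self.0 >> 32;
        let frac = self.0 & 0xFFFF_FFFF;
        // frac < 2^32 and 1e9 < 2^30, so the product fits in a u64.
        let nanos = (frac * 1_000_000_000) >> 32;
        Duration::new(secs, nanos as u32)
    }
}

/// Lag between the local clock (`now`) and a changeset timestamp (`ts`).
/// Returns `None` when `ts` is ahead of `now`, instead of panicking
/// on `now - ts` the way an unchecked u64 subtraction does in a debug build.
fn commit_lag(now: Ntp64, ts: Ntp64) -> Option<Duration> {
    now.0.checked_sub(ts.0).map(|diff| Ntp64(diff).to_duration())
}

fn main() {
    let now = Ntp64::from_secs(100);
    let behind = Ntp64::from_secs(97); // sender clock is behind the receiver
    let ahead = Ntp64::from_secs(105); // sender clock is ahead of the receiver

    println!("{:?}", commit_lag(now, behind));
    println!("{:?}", commit_lag(now, ahead));
    // Clamp to zero if a lag value must always be recorded.
    println!("{:?}", commit_lag(now, ahead).unwrap_or_default());
}
```

With a helper like this, the metrics loop could record the histogram only when the result is `Some(dur)`, or record a zero lag via `unwrap_or_default()`. Which of the two is preferable depends on how the lag metric is meant to be read.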
In a debug build, running `corrosion exec` on a node with a faster clock causes the other nodes to panic (see https://github.com/superfly/corrosion/blob/main/crates/corro-agent/src/agent/handlers.rs#L860). The `Broadcast` changeset sent from the node with the faster clock carries a newer timestamp than the receiving node's clock, so the subtraction overflows. The panic backtrace: