Open xemul opened 6 days ago
Several nodes have this crash in their logs (decoded):
service::topology_coordinator::handle_topology_transition(service::group0_guard)::{lambda()#1}::operator()() const at ././service/topology_coordinator.cc:1872
Nothing unexpected here:
utils::get_local_injector().inject("crash_coordinator_before_stream", [] { abort(); });
test_kill_coordinator_during_op.1.debug.1.zip
node-4165 was trying to restart
It began restarting at 00:24:27:
WARN 2024-10-15 00:24:27,745 seastar - Seastar compiled with default allocator, --memory option won't take effect
last log message was at 00:24:40:
INFO 2024-10-15 00:24:40,859 [shard 0:comp] compaction - [Compact system.peers b91439b0-8a72-11ef-834b-50ec3982def2] Compacted 4 sstables to [/scylladir/testlog/x86_64/debug/scylla-4165/data/system/peers-37f71aca7dc2383ba70672528af04d4f/me-3gkd_1nh4_4p3yo1zvma01fxspv6-big-Data.db:level=0]. 38kB to 9489 bytes (~24% of original) in 68ms = 562kB/s. ~512 total partitions merged to 5.
so it got stuck 13 seconds into the restart procedure.
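The 13-second figure can be double-checked from the two timestamps quoted above. This is only a minimal sketch: it assumes every log line carries a `YYYY-MM-DD HH:MM:SS,mmm` timestamp as in the excerpts here, the helper names `log_time` and `elapsed_seconds` are made up for illustration, and the second log line is abbreviated.

```python
import re
from datetime import datetime

# Matches the "YYYY-MM-DD HH:MM:SS,mmm" timestamp seen in the log excerpts.
TS_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def log_time(line: str) -> datetime:
    """Extract the timestamp from one log line (hypothetical helper)."""
    m = TS_RE.search(line)
    if m is None:
        raise ValueError(f"no timestamp found in: {line!r}")
    # %f accepts the 3-digit millisecond field and right-pads it to microseconds.
    return datetime.strptime(m.group(0), "%Y-%m-%d %H:%M:%S,%f")


def elapsed_seconds(first_line: str, last_line: str) -> float:
    """Seconds between two log lines (hypothetical helper)."""
    return (log_time(last_line) - log_time(first_line)).total_seconds()


first = ("WARN 2024-10-15 00:24:27,745 seastar - Seastar compiled with "
         "default allocator, --memory option won't take effect")
# Abbreviated copy of the last compaction message; only the timestamp matters.
last = "INFO 2024-10-15 00:24:40,859 [shard 0:comp] compaction - Compacted 4 sstables"

print(f"{elapsed_seconds(first, last):.3f} s between first and last message")
```

The same helper can be pointed at any other pair of lines from the node log to see how far apart they are.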
Last message from storage_service is:
INFO 2024-10-15 00:24:38,026 [shard 0:strm] storage_service - The node is already in group 0 and will restart in raft mode
less than a second earlier it started reloading topology state:
DEBUG 2024-10-15 00:24:37,989 [shard 0: gms] raft_topology - reload raft topology state
it's unclear if this operation finished; did it get stuck here?
https://jenkins.scylladb.com/job/scylla-master/job/scylla-ci/12400/