Open andrewbaptist opened 2 hours ago
Describe the problem
While running the backfill test, the backfill hung and nodes became stuck waiting for a snapshot.
To Reproduce
Run the following test:
PERTURBATION_OVERRIDE=acMode=fullBoth roachtest run perturbation/full/backfill
The test in https://github.com/cockroachdb/cockroach/pull/135339 also reproduces the issue.
Additional data / screenshots
The error in the logs is:
E241115 22:00:40.609821 23415709 kv/kvserver/queue.go:1198 ⋮ [T1,Vsystem,n3,raftsnapshot,s6,r7801/4:‹/Table/109/1/-781{9715…-7870…}›] 535505 error sending couldn't accept ‹range_id:7801 coordinator_replica:<node_id:3 store_id:6 replica_id:4 type:VOTER_FULL > recipient_replica:<node_id:12 store_id:24 replica_id:1 type:VOTER_FULL > delegated_sender:<node_id:3 store_id:6 replica_id:4 type:VOTER_FULL > term:7 first_index:11993 sender_queue_name:RAFT_SNAPSHOT_QUEUE descriptor_generation:95 queue_on_delegate_len:-1 snap_id:9e4c2549-8a9c-4d99-8d92-99594f668bd8 ›: (n12,s24):1: remote couldn't accept snapshot 9e4c2549 at applied index 11993: ‹snapshot intersects existing range; initiated GC:› [n12,s24,r7924/4:‹/Table/109/1/-78{2340…-1418…}›] (incoming ‹/Table/109/1/-781{9715178531312532-7870688572937416}›)
This error repeats at a high rate (~100/s).
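For readers decoding the log line: the incoming snapshot span for r7801 appears to fall entirely inside the span still held by r7924 on n12,s24, which is why the recipient rejects it as intersecting. A minimal sketch of that overlap check, not the actual kvserver code: the `span` type and `overlaps` method are hypothetical, and the bounds are only the leading six digits of each key as shown in the log, assuming the truncated keys have the same number of digits as the full ones.

```go
package main

import "fmt"

// span is a hypothetical half-open key interval [start, end).
type span struct{ start, end int64 }

// overlaps reports whether two half-open spans share at least one key.
func (a span) overlaps(b span) bool {
	return a.start < b.end && b.start < a.end
}

func main() {
	// Leading digits of the incoming snapshot span (r7801) from the log.
	incoming := span{-781971, -781787}
	// Leading digits of the existing replica's span (r7924) on n12,s24.
	// The log truncates these keys; only the visible digits are used here.
	existing := span{-782340, -781418}

	fmt.Println(incoming.overlaps(existing)) // prints "true"
}
```

If this reading is right, the snapshot cannot be accepted until the r7924 replica on s24 is removed by the GC that the error says was initiated, and the raft snapshot queue keeps retrying in the meantime.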
Cluster link
Jira issue: CRDB-44457
Hi @andrewbaptist, please add branch-* labels to identify which branch(es) this C-bug affects.
:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.