We frequently see consistency checker error messages in the logs of production clusters, thousands of them in the span of a day:
E230810 10:11:48.452144 1031 kv/kvserver/pkg/kv/kvserver/replica_consistency.go:860 ⋮ [T1,n137,s137,r103516/33:‹x›,raft] 72 could not run async checksum computation (ID = ‹x›): throttled on async limiting semaphore
E230810 10:11:58.275594 3380 kv/kvserver/replica_application_result.go:343 ⋮ [T1,n130,s130,r214910/36:‹x›,raft] 14606 failed to start ComputeChecksum task ‹x›: throttled on async limiting semaphore
E230810 10:12:19.918627 876259 kv/kvserver/replica_consistency.go:838 ⋮ [T1,n139,s139,r489793/3:‹x›] 1549 checksum computation failed: context canceled
These errors are benign, noisy, and unhelpful. We should probably just silence them, and/or make sure we don't hit context cancellation and throttling in the first place.
Jira issue: CRDB-30532
Epic CRDB-39898