Open nicktrav opened 1 month ago
In a similar vein, I had looked at a few issues where ranges were more than just under-replicated: they were unavailable because no quorum could be formed for appending log entries, i.e. only a minority of replicas was up to date on the log. Often this involved a few down nodes, with one of the replicas needed for the remaining quorum either not receiving a snapshot it needed or simply being behind on the log for opaque reasons.
Adapted from CRDB-40230.
Currently, our logs will only indicate there are under-replicated ranges, but they don't say why these ranges are under-replicated.
Determining why requires knowledge and analysis of log files (or files in a debug.zip), which are often difficult to parse.
Improve our logging and how we display and report on under-replicated ranges in the DB console.
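To make the request concrete, here is a rough sketch of the kind of per-range reason string that could be logged or shown. It is only an illustration: `replicaState`, its fields, and `diagnose` are hypothetical names invented for this example, not existing CockroachDB types or APIs, and the classification rules are simplified assumptions (quorum is a simple majority of the current replicas, and a replica counts as healthy only if its node is live, it needs no snapshot, and it is caught up on the log).

```go
package main

import (
	"fmt"
	"strings"
)

// replicaState is a hypothetical, simplified view of one replica of a range.
// It does not correspond to any real CockroachDB type.
type replicaState struct {
	NodeID        int
	Live          bool // the node holding the replica is up
	CaughtUp      bool // the replica is up to date on the Raft log
	NeedsSnapshot bool // the replica is waiting for a snapshot
}

// diagnose returns a status ("ok", "under-replicated", or "unavailable")
// followed by the reasons that led to it. target is the desired number of
// replicas for the range.
func diagnose(target int, replicas []replicaState) string {
	var reasons []string
	healthy := 0
	for _, r := range replicas {
		switch {
		case !r.Live:
			reasons = append(reasons, fmt.Sprintf("n%d is down", r.NodeID))
		case r.NeedsSnapshot:
			reasons = append(reasons, fmt.Sprintf("n%d is waiting for a snapshot", r.NodeID))
		case !r.CaughtUp:
			reasons = append(reasons, fmt.Sprintf("n%d is behind on the log", r.NodeID))
		default:
			healthy++
		}
	}
	if len(replicas) < target {
		reasons = append(reasons,
			fmt.Sprintf("only %d of %d desired replicas exist", len(replicas), target))
	}

	// Assumption: quorum is a simple majority of the current replicas.
	quorum := len(replicas)/2 + 1
	switch {
	case healthy < quorum:
		return "unavailable: " + strings.Join(reasons, "; ")
	case healthy < target:
		return "under-replicated: " + strings.Join(reasons, "; ")
	}
	return "ok"
}
```

For a range with a target of three replicas where n1 is healthy, n2 is down, and n3 is waiting for a snapshot, `diagnose` returns `unavailable: n2 is down; n3 is waiting for a snapshot`. Keeping the status and the reasons in one line is a deliberate choice here, so that a single log entry answers both "what" and "why".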
Jira issue: CRDB-41374