cockroachdb / cockroach

CockroachDB - the open source, cloud-native distributed SQL database.
https://www.cockroachlabs.com

kv: investigate cost of increasing tsdb replication factor to 5 #125144

Open ajstorm opened 4 weeks ago

ajstorm commented 4 weeks ago

As observed on drt-large, users can be confused when they encounter unavailable ranges on a cluster that houses region-survivable databases (which default to RF=5) even though fewer than 3 nodes have failed. In the case uncovered on drt-large, the tsdb was responsible for the unavailable ranges: it defaults to RF=3 and doesn't have an accompanying zone config by default (https://github.com/cockroachdb/cockroach/issues/123762).
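To make the mismatch concrete, here is a minimal sketch of the majority-quorum arithmetic behind it. The helper names `maxFailures` and `available` are hypothetical and are not taken from the CockroachDB codebase. The sketch assumes the worst case, where every failed node holds a replica of the range in question.

```go
package main

import "fmt"

// maxFailures returns how many replicas a range with replication factor rf
// can lose while a strict majority of its replicas remains.
// Hypothetical helper for illustration only.
func maxFailures(rf int) int {
	return (rf - 1) / 2
}

// available reports whether a range with replication factor rf still has a
// majority after `failed` of its replicas are lost (worst case: every failed
// node held a replica of this range).
func available(rf, failed int) bool {
	return failed <= maxFailures(rf)
}

func main() {
	for _, rf := range []int{3, 5} {
		fmt.Printf("RF=%d tolerates %d failed replicas; available after 2 failures: %t\n",
			rf, maxFailures(rf), available(rf, 2))
	}
}
```

With two failed nodes, an RF=5 range keeps its quorum while an RF=3 range can lose it, which matches the behavior described above.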

If users have made all of their databases region survivable (or have otherwise increased their replication factor from 3 to a larger number), they will likely also want their tsdb to survive more than one failed node. This issue aims to investigate the cost of running the tsdb with RF=5, to determine whether we should make that the new default.
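As a first-order starting point for that investigation, the sketch below estimates only the raw storage multiplier, assuming stored bytes scale linearly with the replication factor. It ignores compression, write amplification, and network and CPU overhead, all of which would need to be measured. The function name `replicatedBytes` and the 1 GiB figure are hypothetical.

```go
package main

import "fmt"

// replicatedBytes returns the total bytes stored across the cluster for a
// data set of logicalBytes at replication factor rf, assuming each replica
// stores a full copy. Hypothetical helper for illustration only.
func replicatedBytes(logicalBytes int64, rf int) int64 {
	return logicalBytes * int64(rf)
}

func main() {
	const logical = int64(1) << 30 // hypothetical 1 GiB of time series data

	rf3 := replicatedBytes(logical, 3)
	rf5 := replicatedBytes(logical, 5)

	fmt.Printf("RF=3: %d GiB, RF=5: %d GiB, ratio %.2fx\n",
		rf3>>30, rf5>>30, float64(rf5)/float64(rf3))
}
```

Under this assumption, moving from RF=3 to RF=5 raises the tsdb's storage footprint by a factor of 5/3.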

Jira issue: CRDB-39275

blathers-crl[bot] commented 4 weeks ago

Hi @ajstorm, please add branch-* labels to identify which branch(es) this C-bug affects.

:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.