Open aliher1911 opened 1 year ago
Test that exposed the issue: https://github.com/cockroachdb/cockroach/pull/99020
The allocator should treat dead and live nodes equally once they are marked as decommissioning. Currently, the allocator only starts moving voters off a decommissioning node if the node is live, and ignores the node if its status is `livenesspb.NodeLivenessStatus_UNAVAILABLE`.
As a result, a drain has to wait for `server.time_until_store_dead` (5 minutes) until the node is declared dead, at which point the dead-node rule moves voters away from it.

Ideally, the allocator should drain everything regardless of liveness. Internally, however, both liveness and decommission states are represented by a single enum, and liveness has to take precedence, so it masks the decommissioning state. Handling the two states separately would make allocator behaviour more consistent.
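To illustrate the masking described above, here is a minimal Go sketch. It is not the actual CockroachDB code: every type and function name in it (`NodeState`, `CombinedStatus`, `combine`, `moveVotersCombined`, `moveVotersSeparate`) is hypothetical, and the precedence order in `combine` is an assumption based only on the description in this issue.

```go
package main

// LivenessStatus and MembershipStatus are hypothetical stand-ins for the
// two independent pieces of state discussed in this issue.
type LivenessStatus int

const (
	Live LivenessStatus = iota
	Unavailable
	Dead
)

type MembershipStatus int

const (
	Active MembershipStatus = iota
	Decommissioning
)

// NodeState keeps liveness and membership as separate fields.
type NodeState struct {
	Liveness   LivenessStatus
	Membership MembershipStatus
}

// CombinedStatus models a single enum that has to encode both states.
type CombinedStatus int

const (
	CombinedLive CombinedStatus = iota
	CombinedUnavailable
	CombinedDead
	CombinedDecommissioning
)

// combine collapses NodeState into one enum. Liveness is checked first
// (assumed precedence), so an unavailable node never reports as
// decommissioning.
func combine(s NodeState) CombinedStatus {
	switch {
	case s.Liveness == Dead:
		return CombinedDead
	case s.Liveness == Unavailable:
		return CombinedUnavailable // masks Decommissioning
	case s.Membership == Decommissioning:
		return CombinedDecommissioning
	default:
		return CombinedLive
	}
}

// moveVotersCombined decides using only the single collapsed enum.
func moveVotersCombined(s NodeState) bool {
	c := combine(s)
	return c == CombinedDead || c == CombinedDecommissioning
}

// moveVotersSeparate inspects both fields independently, so the
// decommissioning flag is honoured regardless of liveness.
func moveVotersSeparate(s NodeState) bool {
	return s.Liveness == Dead || s.Membership == Decommissioning
}
```

For a node that is both unavailable and decommissioning, `moveVotersCombined` returns false until the node is eventually declared dead, while `moveVotersSeparate` returns true immediately. For every other combination the two functions agree.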
Jira issue: CRDB-25686