hashicorp / raft-autopilot

Raft Autopilot
Mozilla Public License 2.0

Dead server cleanup should take into account potential future voters and min quorum #17

Closed ncabatoff closed 1 year ago

ncabatoff commented 2 years ago

Autopilot stabilization ensures that new nodes, even ones that are destined to become voters, always start as non-voters until they've been seen to stay current and in contact for the stabilization period.

Autopilot's dead server pruning respects min_quorum only for voter nodes. As a result, a newly joined node that is meant to become a voter, but hasn't yet been deemed stable and promoted, can be pruned by autopilot before it gets a chance to stabilize. This is especially a concern in Vault, where we determine a node to be dead if it isn't sending us heartbeats (not raft heartbeats, a different Vault-specific kind). Those heartbeats won't happen on a newly joined node that's still applying the initial snapshot, because the address to send the heartbeats to is recorded in storage, i.e. it lies within that snapshot.

In discussing this with @mkeeler, he proposed that min quorum should prevent removal of too many non-voters when autopilot knows that we want them to become voters. Persistent non-voters (read replicas) can therefore always be pruned. Any other non-voter can be pruned so long as the remaining set of potential voters would still be able to satisfy the min quorum constraints.
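To make the proposed rule concrete, here is a minimal Go sketch. It is not the raft-autopilot implementation: the `Server` type, its fields, and the `prunable` function are hypothetical names invented for illustration. It also counts current voters and future voters in a single pool, which is an assumption about how the constraint would be applied.

```go
package main

import "fmt"

// Server is a hypothetical, simplified view of a cluster member.
// None of these names come from the raft-autopilot API.
type Server struct {
	ID             string
	Voter          bool // currently a raft voter
	PotentialVoter bool // non-voter expected to be promoted once stable
	Dead           bool // considered failed by the health check
}

// prunable returns the IDs of dead servers that could be removed without
// dropping the number of current plus potential voters below minQuorum.
func prunable(servers []Server, minQuorum int) []string {
	// Count every server that is, or is expected to become, a voter.
	candidates := 0
	for _, s := range servers {
		if s.Voter || s.PotentialVoter {
			candidates++
		}
	}

	var remove []string
	for _, s := range servers {
		if !s.Dead {
			continue
		}
		if !s.Voter && !s.PotentialVoter {
			// Persistent non-voters (read replicas) can always be pruned.
			remove = append(remove, s.ID)
			continue
		}
		// Only prune if enough potential voters would remain afterwards.
		if candidates-1 >= minQuorum {
			remove = append(remove, s.ID)
			candidates--
		}
	}
	return remove
}

func main() {
	servers := []Server{
		{ID: "a", Voter: true},
		{ID: "b", PotentialVoter: true, Dead: true}, // still applying the snapshot
		{ID: "c", PotentialVoter: true, Dead: true}, // still applying the snapshot
		{ID: "r", Dead: true},                       // read replica
	}
	fmt.Println(prunable(servers, 3)) // prints [r]
}
```

With a min quorum of 3, the two joining nodes are kept even though they look dead, because removing either would leave fewer than three potential voters. Only the read replica is pruned.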

weichuliu commented 2 years ago

It really doesn't make much sense to me that a new node has to read/decrypt the whole snapshot to get the raft configuration.

Is there any way to directly query the quorum and get the raft leader without downloading the snapshot?

raskchanky commented 1 year ago

I think https://github.com/hashicorp/raft-autopilot/pull/23 should've addressed this issue, so I'm going to close this. Feel free to reopen if I'm mistaken.