jfray / drip

digitalocean drip drop

Delete carefully to avoid wrecking etcd cluster state #10

Closed jfray closed 9 years ago

jfray commented 9 years ago

From the etcd admin guide:

> etcd is designed to be resilient to machine failures. An etcd cluster can automatically recover from any number of temporary failures (for example, machine reboots), and a cluster of N members can tolerate up to (N/2)-1 permanent failures (where a member can no longer access the cluster, due to hardware failure or disk corruption).

During delete, it makes sense to use the (N/2)-1 permfail heuristic to determine whether or not the delete can proceed. If it can't, maybe prompt for a cluster backup and show a "TURN YOUR KEY SIR" type of confirmation. I won't go after that in this changeset though - it will just fail out if someone tries to kill the damned thing.
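
A rough sketch of what that check could look like, in Python. Nothing here is taken from the drip code: `tolerable_failures`, `can_delete`, and `guard_delete` are hypothetical names, and the sketch assumes that N/2 means integer division and that the result is floored at zero for tiny clusters.

```python
def tolerable_failures(members: int) -> int:
    """Permanent failures tolerated under the (N/2)-1 figure quoted above.

    Assumption: N/2 is integer division, and the result never goes below 0.
    """
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return max(members // 2 - 1, 0)


def can_delete(members: int, already_failed: int = 0, to_delete: int = 1) -> bool:
    """Return True if deleting `to_delete` members stays within the heuristic.

    Members that have already failed permanently count against the budget.
    """
    if already_failed < 0 or to_delete < 1:
        raise ValueError("already_failed must be >= 0 and to_delete >= 1")
    return already_failed + to_delete <= tolerable_failures(members)


def guard_delete(members: int, already_failed: int = 0, to_delete: int = 1) -> None:
    """Fail out instead of deleting when the heuristic says the cluster is at risk."""
    if not can_delete(members, already_failed, to_delete):
        raise RuntimeError(
            f"refusing to delete {to_delete} of {members} members: "
            f"only {tolerable_failures(members)} permanent failure(s) tolerated, "
            f"{already_failed} already failed"
        )
```

Taken literally, this figure is strict: a 3-member cluster gets a budget of zero, so `guard_delete(3)` always fails out, while `guard_delete(5)` passes as long as no member has already failed.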
