platform9 / cctl

Recover etcd cluster #84

Closed dlipovetsky closed 6 years ago

dlipovetsky commented 6 years ago

Master machines run an etcd peer and an instance of the control plane. There are two scenarios where etcd quorum is lost:

  1. One or more master machines have not permanently failed, but their etcd peers can make no progress. To recover, an etcd snapshot should be obtained from one of the peers and a new etcd cluster initialized from the snapshot.

  2. All master machines have permanently failed. To recover, an etcd snapshot should be obtained from a backup and new master machines created, with the etcd cluster initialized from the snapshot.
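
A rough sketch of what the two recovery paths could look like in terms of `etcdctl` invocations. This is not how cctl implements it. The helper names, member names, peer URLs, paths and cluster token are all made up for illustration, and the code only builds the command lines without running them. It assumes the etcd v3 API (`ETCDCTL_API=3` on older releases); newer etcd releases move the restore step to `etcdutl`.

```python
from typing import Dict, List


def snapshot_save_command(endpoint: str, snapshot_path: str) -> List[str]:
    """Scenario 1: take a snapshot from a master whose etcd peer is still reachable.

    In scenario 2 this step is skipped and the snapshot comes from a backup.
    """
    return ["etcdctl", "--endpoints", endpoint, "snapshot", "save", snapshot_path]


def snapshot_restore_commands(
    snapshot_path: str,
    peer_urls: Dict[str, str],
    data_dir: str,
    cluster_token: str,
) -> Dict[str, List[str]]:
    """Build one `etcdctl snapshot restore` command per member of the new cluster.

    peer_urls maps a (hypothetical) member name to its peer URL. Every member
    restores from the same snapshot and is given the same initial cluster list
    and the same cluster token.
    """
    # Sort so that every member gets an identical --initial-cluster value.
    initial_cluster = ",".join(
        f"{name}={url}" for name, url in sorted(peer_urls.items())
    )
    commands = {}
    for name, url in peer_urls.items():
        commands[name] = [
            "etcdctl", "snapshot", "restore", snapshot_path,
            "--name", name,
            "--initial-cluster", initial_cluster,
            "--initial-cluster-token", cluster_token,
            "--initial-advertise-peer-urls", url,
            "--data-dir", data_dir,
        ]
    return commands


if __name__ == "__main__":
    # Hypothetical two-master cluster being rebuilt from one snapshot.
    peers = {"master-1": "https://10.0.0.1:2380", "master-2": "https://10.0.0.2:2380"}
    print(" ".join(snapshot_save_command("https://10.0.0.1:2379", "/tmp/etcd-snapshot.db")))
    for cmd in snapshot_restore_commands(
        "/tmp/etcd-snapshot.db", peers, "/var/lib/etcd", "recovered-cluster"
    ).values():
        print(" ".join(cmd))
```

Either way the restore step is the same; the scenarios differ only in where the snapshot file comes from and whether the master machines have to be recreated first.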

dlipovetsky commented 6 years ago

Implemented in #85