Closed: wallrj closed this issue 4 years ago
The E2E test failure in https://app.circleci.com/jobs/github/improbable-eng/etcd-cluster-operator/492/parallel-runs/0/steps/0-104 demonstrates that the etcd service is sometimes quite badly disrupted by the scale-down operations. I think this is because a scale-down triggers one and sometimes two leader elections.
We could try to select non-leader members for removal, but this would often leave the member names out of sequence. Another good reason not to use ordinal names, perhaps?
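To make the trade-off concrete, here is a minimal sketch of non-leader selection. The `Member` type and `pickMemberToRemove` function are hypothetical names for illustration, not part of the operator. The sketch assumes the leader's member ID is already known (for example, from the `Leader` field of an etcd status response) and that members are listed in ordinal order.

```go
package main

import "fmt"

// Member is a hypothetical, simplified view of an etcd cluster member.
type Member struct {
	ID   uint64
	Name string
}

// pickMemberToRemove walks the member list from the highest ordinal down
// and returns the first member that is not the current leader.
// It returns false if every member is the leader (a single-member cluster)
// or if the list is empty.
func pickMemberToRemove(members []Member, leaderID uint64) (Member, bool) {
	for i := len(members) - 1; i >= 0; i-- {
		if members[i].ID != leaderID {
			return members[i], true
		}
	}
	return Member{}, false
}

func main() {
	members := []Member{
		{ID: 1, Name: "etcd-0"},
		{ID: 2, Name: "etcd-1"},
		{ID: 3, Name: "etcd-2"},
	}
	// Assume the highest-ordinal member (etcd-2) is currently the leader.
	m, ok := pickMemberToRemove(members, 3)
	fmt.Println(m.Name, ok) // prints "etcd-1 true"
}
```

In this example the cluster is left with `etcd-0` and `etcd-2`, which is exactly the out-of-sequence naming problem described above.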
But I also haven't had time to check whether there would be a leader election anyway. Perhaps leader elections happen as a result of any change in cluster membership, in which case I can remove (or adjust) the E2E cluster availability test.
Follow-up issue for safe deletion of PVCs during scale-down: https://github.com/improbable-eng/etcd-cluster-operator/issues/101
Part of #35