zalando-stups / stups-etcd-cluster

Etcd cluster appliance for the STUPS (AWS) environment

Add new members to an existing cluster one by one. #2

Closed CyberDem0n closed 8 years ago

CyberDem0n commented 8 years ago

Before sending the add-member command to an existing cluster, we should check that no other node is already in the process of adding itself to the cluster.

Basically, this is just a workaround for the following problem: https://github.com/zalando/stups-etcd-cluster/issues/1

New members were added successfully, but etcd failed to start due to a version incompatibility, and we lost quorum.
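
The check described above could look roughly like the following. This is only a minimal sketch, not the code from this repository: it assumes the member list has the shape returned by the etcd v2 `GET /v2/members` endpoint, and it assumes that a member which has been added but has not yet started shows up with an empty `name` and no `clientURLs`. The helper names `unstarted_members` and `safe_to_add_member` are hypothetical.

```python
def unstarted_members(members):
    """Return members that were added to the cluster but have not started yet.

    Assumption: such members have an empty name or no clientURLs in the
    member list (the shape of the etcd v2 /v2/members response).
    """
    return [m for m in members if not m.get("name") or not m.get("clientURLs")]


def safe_to_add_member(members):
    """Allow a new add-member request only if nobody else is still joining."""
    return len(unstarted_members(members)) == 0


# Illustrative data only: the names and URLs below are made up.
members = [
    {"id": "af90e28ea0bc9d03", "name": "node-a",
     "peerURLs": ["http://10.0.0.1:2380"], "clientURLs": ["http://10.0.0.1:2379"]},
    {"id": "231a86654ad1f632", "name": "",
     "peerURLs": ["http://10.0.0.4:2380"], "clientURLs": []},
]

if safe_to_add_member(members):
    print("no pending joins, safe to add ourselves")
else:
    pending = [m["id"] for m in unstarted_members(members)]
    print("another member is still joining, retry later:", pending)
```

A node that finds a pending member would wait and retry instead of sending its own add-member request, so members join one at a time and a node that fails to start cannot cost the cluster its quorum together with several others.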

feikesteenbergen commented 8 years ago

Manual regression test for the network partition failure scenario:

3 node cluster, status:

cluster is healthy
member af90e28ea0bc9d03 is healthy
member e32468c722a28bd4 is healthy
member ff4e0f5a2f7eb641 is unhealthy

Add a new node by bumping the Auto Scaling Group from 3 to 4:

The cluster adds a member and becomes unhealthy for a short while; after that, the cluster is healthy again:

cluster is healthy
member 231a86654ad1f632 is healthy
member af90e28ea0bc9d03 is healthy
member e32468c722a28bd4 is healthy
member ff4e0f5a2f7eb641 is unhealthy

My current hypothesis for why this happens is:

feikesteenbergen commented 8 years ago

Tested again, by doing the following:

feikesteenbergen commented 8 years ago

+1