Open vitabaks opened 1 year ago
Thank you for adding this to the enhancement list!
I am not sure if this would help, but I believe I was able to replace a dead etcd node by doing the following:
In the inventory file:
Mark the failed node with new_node=true:
[etcd_cluster] # recommendation: 3, or 5-7 nodes
10.10.10.77
10.10.10.78
10.10.10.79 new_node=true
On any healthy node, remove the dead member as follows:
export ETCDCTL_API=3
HOST_1=10.10.10.77
HOST_2=10.10.10.78
HOST_3=10.10.10.79
ENDPOINTS=$HOST_1:2379,$HOST_2:2379,$HOST_3:2379
Get the dead node's ID:
etcdctl --endpoints=$ENDPOINTS member list
etcdctl --endpoints=$ENDPOINTS endpoint health
etcdctl --write-out=table --endpoints=$ENDPOINTS endpoint status
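To avoid copying the ID by hand, the member list output can be filtered by the dead node's peer address. This is only a minimal sketch: it assumes the default comma-separated output of `etcdctl member list` (ID, status, name, peer URLs, client URLs, ...), and `member_id_for_peer` is a hypothetical helper name, not an etcdctl feature.

```shell
# Hypothetical helper (not part of etcdctl): print the ID of the member
# whose peer URL points at the given host.
# Reads the default (simple) output of `etcdctl member list` on stdin,
# assuming the fields are: ID, status, name, peer URLs, client URLs, ...
member_id_for_peer() {
  # Match "//host:" so that 10.10.10.7 does not also match 10.10.10.79.
  awk -F', ' -v host="$1" 'index($4, "//" host ":") { print $1 }'
}

# Usage (assumes ENDPOINTS is set as shown above):
#   etcdctl --endpoints=$ENDPOINTS member list | member_id_for_peer 10.10.10.79
```

Matching on the peer URL rather than the member name also works for a member that was added but never started, since such a member has no name yet.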
Remove the dead node:
etcdctl member remove {NODE ID HERE}
Re-add the new node:
etcdctl member add ffs-node03 --peer-urls=http://10.10.10.79:2380
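The remove and re-add commands above could be wrapped in one function so the two steps always run in the same order. This is a sketch under assumptions: `replace_member` and the `ETCDCTL` override variable are hypothetical names introduced here, and the endpoints passed in should be those of healthy members only.

```shell
# Hypothetical wrapper around the two commands above (not part of the playbook).
# Arguments: <healthy endpoints> <dead member ID> <new member name> <new peer URL>
# ETCDCTL may be overridden, e.g. ETCDCTL="echo etcdctl", to preview the
# commands without touching the cluster.
replace_member() {
  rm_endpoints="$1"
  rm_member_id="$2"
  rm_name="$3"
  rm_peer_url="$4"

  # Remove the dead member first; stop if that fails.
  ${ETCDCTL:-etcdctl} --endpoints="$rm_endpoints" member remove "$rm_member_id" || return 1

  # Register the replacement member under the given name and peer URL.
  ${ETCDCTL:-etcdctl} --endpoints="$rm_endpoints" member add "$rm_name" --peer-urls="$rm_peer_url"
}

# Usage (the member ID is a placeholder):
#   replace_member "$HOST_1:2379,$HOST_2:2379" <MEMBER_ID> ffs-node03 http://10.10.10.79:2380
```

Removing before adding keeps the member count at three, so the cluster never needs a larger quorum while one member is still unreachable.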
Modify the etcd.conf template:
ETCD_INITIAL_CLUSTER_STATE="{{ 'existing' if new_node | default(false) | bool else 'new' }}"
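The template line above selects `existing` for a node flagged with new_node (it joins a cluster that is already running) and `new` for every other node (initial bootstrap). The same decision is sketched in shell below, purely as an illustration. `initial_cluster_state` is a hypothetical function name, and the set of truthy strings is an assumption based on how Ansible's `bool` filter is commonly described.

```shell
# Illustrative only: mirrors the Jinja2 expression
#   'existing' if new_node | default(false) | bool else 'new'
# Assumption: true/yes/on/1 (case-insensitive) count as truthy.
initial_cluster_state() {
  case "$(printf '%s' "${1:-false}" | tr '[:upper:]' '[:lower:]')" in
    true|yes|on|1) echo existing ;;  # node joins a running cluster
    *)             echo new ;;       # node is part of the initial bootstrap
  esac
}

# Usage:
#   initial_cluster_state true    # for 10.10.10.79 (new_node=true)
#   initial_cluster_state         # for hosts without new_node set
```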
Then rerun the etcd playbook like this:
ansible-playbook etcd_cluster.yml -i environments/staging/inventory --extra-vars "@environments/staging/main.yml"
@m3ki Thank you for your comment. I think some of the examples you provided will be used as a basis for further automation of the etcd cluster management process.
Now I had to modify the Ansible etcd playbook to change the etcd user's home directory to something other than the etcd data directory, like so: add/modify the following after the etcd data directory
Please tell me why it was necessary to do this.
I had an issue starting the etcd server on the new node: it complained that the directory already had files in it, such as .bashrc and .profile. That is very odd, since etcd is not a login user and I haven't logged in as that user either. I'll test some more and report back.
@vitabaks Disregard the change to the home directory; it seems to work fine. I just retested on my test cluster! I updated my comment above too.
Currently, scaling of Postgres nodes using the playbook add_pgnode.yml and scaling of HAProxy nodes using the playbook add_balancer.yml are implemented, but for an etcd or consul cluster only the initial deployment is automated. Further maintenance, such as replacing a failed node or scaling a DCS cluster, has to be done manually.
Automate the scaling of DCS cluster nodes