Closed coryflucas closed 5 years ago
Good catch! Sounds reasonable.
I tested this by spinning up a cluster with the new code. Calling the agent/leave endpoint removes the target host from the peer list reported by the other agents, and a subsequent API request to the target host is refused:
[ec2-user@ip-10-1-1-250 ~]$ TARGET=10.1.12.161
[ec2-user@ip-10-1-1-250 ~]$ OTHER=10.1.10.62
[ec2-user@ip-10-1-1-250 ~]$ curl -w "\n" http://$OTHER:8500/v1/status/peers
["10.1.12.161:8300","10.1.11.254:8300","10.1.10.62:8300"]
[ec2-user@ip-10-1-1-250 ~]$ curl -w "\n" http://$TARGET:8500/v1/status/peers
["10.1.12.161:8300","10.1.11.254:8300","10.1.10.62:8300"]
[ec2-user@ip-10-1-1-250 ~]$ curl -X PUT http://$TARGET:8500/v1/agent/leave
[ec2-user@ip-10-1-1-250 ~]$ curl -w "\n" http://$OTHER:8500/v1/status/peers
["10.1.11.254:8300","10.1.10.62:8300"]
[ec2-user@ip-10-1-1-250 ~]$ curl http://$TARGET:8500/v1/status/peers
curl: (7) Failed to connect to 10.1.12.161 port 8500: Connection refused
Prior to this change, the last two commands would show the target host had re-joined the cluster and restarted the API.
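For anyone who wants to repeat this check, the manual steps above could be wrapped in a small script. This is only a sketch: the function names `peer_listed` and `verify_leave`, the 5-second wait, and the curl timeout are illustrative assumptions and not part of this PR. It assumes the HTTP API listens on port 8500 and that peers are reported as `IP:port` strings, as in the session above.

```shell
#!/usr/bin/env bash

# Return 0 if the given IP appears in a /v1/status/peers JSON array, else 1.
# Matching on the opening quote and trailing colon avoids partial matches,
# e.g. 10.1.12.16 must not match "10.1.12.161:8300".
peer_listed() {
  local peers_json="$1" ip="$2"
  printf '%s' "$peers_json" | grep -qF "\"${ip}:"
}

# Hypothetical helper: ask TARGET to leave, then confirm via OTHER that it is
# gone and that TARGET's API stays down (i.e. it was not restarted).
verify_leave() {
  local target="$1" other="$2" peers

  curl -sf -X PUT "http://${target}:8500/v1/agent/leave" || return 1

  # Assumed grace period for the leave to propagate; adjust as needed.
  sleep 5

  peers=$(curl -sf "http://${other}:8500/v1/status/peers") || return 1
  if peer_listed "$peers" "$target"; then
    echo "FAIL: ${target} is still listed as a peer"
    return 1
  fi

  # If the agent was restarted, this request succeeds again.
  if curl -sf --max-time 5 "http://${target}:8500/v1/status/peers" >/dev/null; then
    echo "FAIL: API on ${target} is reachable again"
    return 1
  fi

  echo "OK: ${target} left the cluster and stayed down"
}
```

With the addresses from the session above, it would be invoked as `verify_leave 10.1.12.161 10.1.10.62`.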
Great, thanks! I'll merge this now and let the tests run. If they pass, I'll create a new release and share the link.
Thanks! And thanks for putting this project together!
@Etiene The tests failed on this PR, but I just noticed they have been failing since #177 was merged. Could you look into it?
Fixes #108. This allows removing servers gracefully via the /agent/leave API endpoint for upgrades, as suggested in the documentation here: https://github.com/hashicorp/terraform-aws-consul/tree/master/modules/consul-cluster#how-do-you-roll-out-updates

Currently, invoking the leave endpoint causes Consul to stop, but supervisor immediately restarts it and it rejoins the cluster. This makes a graceful shutdown impossible without shell access to the host to stop supervisor directly.