hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

RFE: Have /v1/agent/maintenance respond to GET with valid response codes #2779

Open jasonmcintosh opened 7 years ago

jasonmcintosh commented 7 years ago

Ideally, there'd be a simple way to tell the state of a node via HTTP response codes. Even better would be via a HEAD rather than a GET request, so that things like load balancers can automatically remove nodes based on failures. Specifically:

/v1/agent/maintenance GET/HEAD - 503 when in maintenance mode
/v1/agent/maintenance GET/HEAD - 200 when not in maintenance mode

Another endpoint that should return a 503 response code as well: /v1/agent/checks

It'd be nice if /v1/agent/checks also responded to a HEAD request with the response code, for simple health checking/monitoring.
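
Until something like this exists, the requested 503/200 behaviour can be approximated on the client side. The following is only a minimal sketch, not how Consul itself behaves: it assumes the agent's /v1/agent/checks response is a JSON object keyed by check ID, and that node maintenance mode shows up there as a check with the ID `_node_maintenance`. The function names and the default agent address are illustrative assumptions.

```python
import json
from urllib.request import urlopen

# Assumed ID of the check the agent registers while node maintenance
# mode is enabled; verify against your Consul version.
NODE_MAINT_CHECK_ID = "_node_maintenance"


def maintenance_status_code(checks: dict) -> int:
    """Map an /v1/agent/checks payload to the status code requested above:
    503 when the node-maintenance check is present, 200 otherwise."""
    return 503 if NODE_MAINT_CHECK_ID in checks else 200


def fetch_agent_checks(base_url: str = "http://127.0.0.1:8500") -> dict:
    """Fetch the local agent's checks (base_url is an assumed default)."""
    with urlopen(f"{base_url}/v1/agent/checks", timeout=2) as resp:
        return json.load(resp)
```

A monitoring script could call `maintenance_status_code(fetch_agent_checks())` and alert on 503; a load balancer still cannot use this directly, which is the gap this request is about.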

slackpad commented 7 years ago

Hi @jasonmcintosh, thanks for opening an issue. Usually things like load balancers will work off of endpoints like https://www.consul.io/api/health.html#list-nodes-for-service. Are you running something that sits alongside the Consul agent on each node in your cluster?

jasonmcintosh commented 7 years ago

The list-nodes-for-service endpoint returns ALL nodes, not the specific node you're communicating with. E.g. if you wanted to tell from a command line whether the current system is in a healthy state, or wanted to monitor the state of an agent via a remote Zabbix call, there's no good endpoint to do so right now. Standard handling for that (e.g. Spring Cloud's health check) changes the HTTP response code and marks the status as down on /health.

I'd be fine with a completely new endpoint if that's viable, e.g. /v1/agent/health, that does the work of figuring out the node name and verifying the status checks, and returns standard HTTP codes.
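
To make the proposal concrete, here is a rough sketch of what such an endpoint could look like if run as a small sidecar next to the agent. It is not an existing Consul API: the /v1/agent/health path is the hypothetical one suggested above, `make_health_handler` and `get_checks` are made-up names, and the rule "any check with Status `critical` means 503" is an assumption about what "verifying the status checks" should mean.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict


def make_health_handler(get_checks: Callable[[], Dict[str, dict]]):
    """Build a handler that answers GET/HEAD on the hypothetical
    /v1/agent/health path with 200 or 503 based on local check state."""

    class HealthHandler(BaseHTTPRequestHandler):
        def _status(self) -> int:
            if self.path != "/v1/agent/health":
                return 404
            try:
                checks = get_checks()
            except Exception:
                return 503  # agent unreachable: treat the node as unhealthy
            critical = any(c.get("Status") == "critical" for c in checks.values())
            return 503 if critical else 200

        def do_HEAD(self):
            # Only the status code matters here, so no body is sent.
            self.send_response(self._status())
            self.send_header("Content-Length", "0")
            self.end_headers()

        do_GET = do_HEAD  # GET returns the same code with an empty body

        def log_message(self, *args):
            pass  # keep the sketch quiet

    return HealthHandler


# Example wiring (port 8501 is arbitrary; get_checks would query the agent):
# HTTPServer(("127.0.0.1", 8501), make_health_handler(get_checks)).serve_forever()
```

Taking `get_checks` as a parameter keeps the status-code logic separate from how the checks are fetched, so the same handler works whether the data comes from the local agent's /v1/agent/checks or from a test stub.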