hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io
Other
28.31k stars 4.42k forks source link

Health HTTP endpoint for check alone #4636

Open koalalam opened 6 years ago

koalalam commented 6 years ago

Feature Description

A HTTP API that will return 4xx status code when a serfHealth of a target node fails.

Use Case(s)

The current /health/node/:node API will always return 200 even though the provided node name/id is invalid or status of 'serfHealth' isn't passing. We need this new API to configure our Consul Server DNS name.

pierresouchay commented 6 years ago

Changing HTTP status will break the API, why do you need HTTP 4xx for that?

Having [] probably means your node has no healt check at all, including no serf health If there are values, it means the node does exists, you just have to check the status of the checks to decide what to do.

We are doing exactly this in https://github.com/criteo/consul-templaterb/tree/master/samples/consul-ui in order to interpret whether the node is healthy or not.

If you are looking at HTTP compatibility with healthchecks, we did implement this as well: https://github.com/hashicorp/consul/pull/3551 => allow to check connectivity with agent as well as having standard HTTP code for healthiness , we use this as healthchecks for our load-balancer stack, but this is another topic (this is service centric, not node centric)

koalalam commented 6 years ago

Because at the DNS layer, it doesn't need to interpret the result in order to tell if the target service/endpoint is healthy or not. Don't have to be 4xx, but any non-200 (OK) status code would work. It is not uncommon to have such implementation. For example, Dropwizard (microservice framework) uses 500.

https://www.dropwizard.io/1.3.5/docs/manual/core.html#health-checks

pearkes commented 6 years ago

@koalalam I think you're asking for essentially a new API endpoint. The health catalog API request should return the status of the request if it succeeded in displaying the information requested (node health) and not propagate the underlying health. As you probably know we do have endpoints that do have this behavior but they are explicitly documented as such.

vodrazka commented 3 years ago

@koalalam did you manage to solve this somehow?

I'm trying to check the health of a specific consul node from HAProxy...