A library that enables async dependency health checking for services running on an orchestrated container platform such as Kubernetes or Mesos.
Container orchestration platforms require that the underlying service(s) expose a "health check" which is used by the platform to determine whether the container is in a good or bad state.
While this can be achieved by simply exposing a /status endpoint that performs synchronous checks against its dependencies (followed by returning a 200 or non-200 status code), it is not optimal for a number of reasons:
- Some dependency checks can take 30s+ to complete.
- When many instances of your service start at once, all of their /status endpoints are checked by the orchestration platform as soon as they come up. Depending on the complexity of the checks, running that many simultaneous checks against your dependencies adds unnecessary load at minimum and, at worst, causes the dependencies themselves to experience problems.
- Other automated clients can hit your /status endpoint and trigger unnecessary dep checks.
- Sustained traffic against your /status endpoint could choke your dependencies (and potentially your service).
With that said, not everyone needs asynchronous checks. If your service has one dependency (and that is unlikely to change), it is trivial to write a basic, synchronous check and it will probably suffice.
However, if you anticipate that your service will have several dependencies, with varying degrees of complexity for determining their health state, you should probably think about introducing asynchronous health checks.
Writing an async health checking framework for your service is not a trivial task, especially if Go is not your primary language.
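This library:
- Runs your dependency checks in the background, on an interval you configure. [1]
- Collects the check results so that you can use them to determine the response of your /status endpoint. [2]
- Comes with pre-built checkers for well-known dependencies such as Redis, Mongo, HTTP and more.
- Lets you react when the overall health status changes via the IStatusListener interface.
- Lets you run custom logic when an individual health check completes via the OnComplete hook.
[1] Make sure to run your checks on a "sane" interval, i.e., if you are checking your Redis dependency once every five minutes, your service is essentially running blind for about 4:59 of every 5 minutes. Unless you have a really good reason, check your dependencies every X seconds, rather than X minutes.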
[2] go-health continuously writes dependency health state data and allows you to query that data via .State(). Alternatively, you can use one of the pre-built HTTP handlers for your /healthcheck endpoint (and thus not have to manually inspect the state data).
For full examples, look through the examples directory.
Create an instance of health and configure a checker (or two):
    import (
        "net/url"
        "time"

        health "github.com/InVisionApp/go-health/v2"
        "github.com/InVisionApp/go-health/v2/checkers"
        "github.com/InVisionApp/go-health/v2/handlers"
    )

    // Create a new health instance
    h := health.New()

    // Create a checker (errors are ignored here for brevity; handle them in real code)
    myURL, _ := url.Parse("https://google.com")
    myCheck, _ := checkers.NewHTTP(&checkers.HTTPConfig{
        URL: myURL,
    })
Register your checks with the health instance:
    h.AddChecks([]*health.Config{
        {
            Name:     "my-check",
            Checker:  myCheck,
            Interval: 2 * time.Second, // run this check every two seconds
            Fatal:    true,
        },
    })
Start the health checks:
    h.Start()
From here on, you can either configure an endpoint such as /healthcheck to use a built-in handler such as handlers.NewJSONHandlerFunc() or get the current health state of all your deps by traversing the data returned by h.State().
Assuming you have configured go-health with two HTTP checkers, your /healthcheck output would look something like this:
{
    "details": {
        "bad-check": {
            "name": "bad-check",
            "status": "failed",
            "error": "Ran into error while performing 'GET' request: Get google.com: unsupported protocol scheme \"\"",
            "check_time": "2017-12-30T16:20:13.732240871-08:00"
        },
        "good-check": {
            "name": "good-check",
            "status": "ok",
            "check_time": "2017-12-30T16:20:13.80109931-08:00"
        }
    },
    "status": "ok"
}
At first glance it may seem that the IStatusListener interface and the OnComplete hook provide the same functionality. However, they are meant for two different use cases:
The IStatusListener is useful when you want to run a custom function in the event that the overall status of your health checks changes. For example, if go-health is currently checking the health of two different dependencies A and B, you may want to trip a circuit breaker for A and/or B. You could also put your service in a state where it notifies callers that it is not currently operating correctly. The opposite can be done when your service recovers.
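The circuit-breaker idea could be sketched as follows. This is an illustration under stated assumptions, not the library's code: the two method names and their parameters are assumed to match the IStatusListener interface and should be verified against the package documentation, State is a local stand-in type so the sketch compiles on its own, and breakerListener is a hypothetical name. Registering the listener with the health instance is not shown.

```go
package main

import (
	"fmt"
	"sync"
)

// State is a local stand-in for the library's state type (an assumption made
// so this sketch compiles on its own); only the field used here is included.
type State struct {
	Name string
}

// breakerListener remembers which dependencies are currently failing.
// (Hypothetical type for illustration.)
type breakerListener struct {
	mu   sync.Mutex
	open map[string]bool
}

func newBreakerListener() *breakerListener {
	return &breakerListener{open: make(map[string]bool)}
}

// HealthCheckFailed trips the breaker for the failing dependency.
// The signature is an assumption about the listener interface.
func (b *breakerListener) HealthCheckFailed(entry *State) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.open[entry.Name] = true
}

// HealthCheckRecovered resets the breaker once the dependency is healthy again.
// The extra parameters are unused here; they are part of the assumed signature.
func (b *breakerListener) HealthCheckRecovered(entry *State, recordedFailures int64, failureDurationSeconds float64) {
	b.mu.Lock()
	defer b.mu.Unlock()
	delete(b.open, entry.Name)
}

// IsOpen reports whether calls to the named dependency should be short-circuited.
func (b *breakerListener) IsOpen(name string) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.open[name]
}

func main() {
	l := newBreakerListener()

	l.HealthCheckFailed(&State{Name: "A"})
	fmt.Println(l.IsOpen("A"), l.IsOpen("B")) // true false

	l.HealthCheckRecovered(&State{Name: "A"}, 3, 6.0)
	fmt.Println(l.IsOpen("A")) // false
}
```

Request-handling code can then consult IsOpen before calling a dependency and fail fast while its breaker is tripped.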
The OnComplete hook is called whenever a health check for an individual dependency is complete. This means that the function you register with the hook gets called every single time go-health completes the check. It's completely possible to register different functions with each configured health check, or not to hook into the completion of certain health checks at all. For instance, this can be useful if you want to perform cleanup after a complex health check or if you want to send metrics to your APM software when a health check completes. It is important to keep in mind that this hook effectively gets called on roughly the same interval you define for the health check.
All PRs are welcome, as long as they are well tested. Follow the typical fork -> branch -> PR flow.