Closed: saliceti closed this 7 years ago
Why not just use a TCP health check vs this NOOP HTTP health check?
If we are going to add a health check endpoint, I think it should attempt to exercise some functionality to ensure it is actually healthy / configured correctly.
We are planning to implement more checks because the broker has many dependencies: Postgres, Let's Encrypt, Cloud Controller, CloudFront, and S3.
This endpoint is a starting point. It doesn't do much, but already does more than just TCP. It confirms the broker can actually talk HTTP.
Hi, we are looking at improving this healthcheck as part of our current sprint. So far we are planning to do the following:
We can roughly test the LE server by hitting the ACME directory URL: https://github.com/xenolf/lego/blob/master/acme/client.go#L81
We can test cloudfront using the ListDistributions call https://github.com/aws/aws-sdk-go/blob/master/service/cloudfront/api.go#L1905
We can test the RDS instance by connecting to it
We can test the cf client using the ListDomains call https://github.com/cloudfoundry-community/go-cfclient/blob/3805b12f648c81339bca0df4124800e7c9575865/domains.go#L87
We can test the S3 bucket by writing an object to it, reading it back, then deleting it
Is there anything else you think we should be testing as part of this? If we implemented the above, would you be open to merging a PR?
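As a rough illustration of the S3 idea above (write an object, read it back, then delete it), here is a minimal sketch. The ObjectStore interface, CheckObjectStore, and the in-memory memStore are hypothetical names invented for this example. A real check would implement the interface on top of the AWS SDK, which is not shown here.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// ObjectStore is a hypothetical minimal view of an S3-like bucket.
type ObjectStore interface {
	Put(key string, data []byte) error
	Get(key string) ([]byte, error)
	Delete(key string) error
}

// CheckObjectStore writes a probe object, reads it back, and deletes it.
func CheckObjectStore(store ObjectStore, key string) error {
	want := []byte("healthcheck")
	if err := store.Put(key, want); err != nil {
		return fmt.Errorf("put %q: %w", key, err)
	}
	got, getErr := store.Get(key)
	// Always attempt cleanup, even if the read failed.
	delErr := store.Delete(key)
	if getErr != nil {
		return fmt.Errorf("get %q: %w", key, getErr)
	}
	if !bytes.Equal(got, want) {
		return fmt.Errorf("get %q: content mismatch", key)
	}
	if delErr != nil {
		return fmt.Errorf("delete %q: %w", key, delErr)
	}
	return nil
}

// memStore is an in-memory stand-in used only to exercise the check.
type memStore map[string][]byte

func (m memStore) Put(key string, data []byte) error {
	m[key] = append([]byte(nil), data...)
	return nil
}

func (m memStore) Get(key string) ([]byte, error) {
	data, ok := m[key]
	if !ok {
		return nil, errors.New("not found")
	}
	return data, nil
}

func (m memStore) Delete(key string) error {
	delete(m, key)
	return nil
}

func main() {
	store := memStore{}
	fmt.Println("check error:", CheckObjectStore(store, "healthcheck-probe"))
}
```

Keeping the check behind a small interface means it can be tested without touching a real bucket. The probe key should be one that cannot collide with real broker data.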
It's probably not a good idea to mark the service broker down from an LB point of view if LetsEncrypt is down. The rest of the checks SGTM!
@cnelson we've done the work to introduce the additional healthcheck endpoints. They are:
The level of testing varies based on what is feasible in a quick check - for example it's not reasonable to check that Cloudfront can provision a distribution.
The HTTP-only test has moved to /healthchecks/http as /healthchecks now runs all available healthchecks. We're using /healthcheck/http from the load balancer and /healthcheck from our monitoring system.
What
Add a simple healthcheck endpoint. We deploy the CDN broker on its own VM and access it via an ELB. The ELB uses this endpoint to check whether the app is running, so it must not be protected by basic auth.
Also, the cfclient library is updated to avoid annoying crashes.
How to review
cd cdm/cdn-broker && go test
Request /healthcheck; it should return HTTP 200 OK.