cloud-gov / cf-cdn-service-broker

A Cloud Foundry service broker for CloudFront and Let's Encrypt

Add healthcheck endpoint #81

Closed saliceti closed 7 years ago

saliceti commented 7 years ago

What

Add a simple healthcheck endpoint. We deploy the CDN broker on its own VM and access it through an ELB, which uses this endpoint to check whether the app is running. The endpoint should not be protected by basic auth.

Also, the cfclient library is updated to avoid annoying crashes.

How to review

cnelson commented 7 years ago

Why not just use a TCP health check instead of this no-op HTTP health check?

If we are going to add a health check endpoint, I think it should attempt to exercise some functionality to ensure it is actually healthy / configured correctly.

saliceti commented 7 years ago

We are planning to implement more checks because the broker has many dependencies: Postgres, Let's Encrypt, Cloud Controller, CloudFront, and S3.

This endpoint is a starting point. It doesn't do much, but already does more than just TCP. It confirms the broker can actually talk HTTP.

timmow commented 7 years ago

Hi, we are looking at improving this healthcheck as part of our current sprint. So far we are planning on doing the following:

We can roughly test the Let's Encrypt server by hitting the ACME directory URL: https://github.com/xenolf/lego/blob/master/acme/client.go#L81

We can test CloudFront using the ListDistributions call: https://github.com/aws/aws-sdk-go/blob/master/service/cloudfront/api.go#L1905

We can test the RDS instance by connecting to it.

We can test the cf client using the ListDomains call: https://github.com/cloudfoundry-community/go-cfclient/blob/3805b12f648c81339bca0df4124800e7c9575865/domains.go#L87

We can test the S3 bucket by writing an object to it, reading it back, then deleting it.

Is there anything else you think we should be testing as part of this? If we implemented the above, would you be open to merging a PR?

cnelson commented 7 years ago

It's probably not a good idea for the load balancer to mark the service broker as down just because Let's Encrypt is down. The rest of the checks SGTM!

dcarley commented 7 years ago

@cnelson we've done the work to introduce the additional healthcheck endpoints. They are:

The level of testing varies based on what is feasible in a quick check. For example, it's not reasonable to check that CloudFront can provision a distribution.

The HTTP-only test has moved to /healthcheck/http, as /healthcheck now runs all available healthchecks. We're using /healthcheck/http from the load balancer and /healthcheck from our monitoring system.