Logging on healthcheck fail

Ortus-Solutions / docker-commandbox

Official CommandBox Docker Image for ColdFusion/CFML/Java applications


healthcheck URL: http://localhost/DB_Check.cfm URL FAILED, returning [500 Internal Server Error] after 12 seconds. Error Response: Database connection failed or whatever HTML came back, possibly truncated to a few thousand chars.

I'm circling back to this today before I bump a release. The default healthcheck does capture the output of curl --fail ${HEALTHCHECK_URI}. A container with a Killed status, though, may also have been OOM-killed, which usually happens when the container exceeds either the total host resources or a resource limit specified in a stack file or other config.

Below is an example of the formatted docker inspect "Health" section for a failed container. Note that this container started failing with timeouts to the URL on port 8080, so you only see the cURL progress output. On a successful healthcheck you would see the actual output, and on a 500 error you would see the error HTML output.

While it is possible to implement some sort of custom healthcheck.sh that looks at the status code and the response pattern, the downside is that on a healthcheck timeout (default 30s) you would get no output at all, because the custom parsing script would swallow any cURL output. As such, I think I'm going to hold off on trying that for now.
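For reference, a custom check along those lines might look roughly like the sketch below. It is not what the image ships. It is written in Python rather than as a healthcheck.sh, and the names check, main, MAX_BODY_CHARS, and the expect parameter are all made up for illustration. The idea is to always return a message, including on a timeout, so the health log is never empty.

```python
import os
import urllib.error
import urllib.request
from typing import Optional, Tuple

MAX_BODY_CHARS = 2000  # assumed truncation limit, not a setting of the image

# Healthchecks target the local server, so ignore any proxy environment variables.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def check(url: str, timeout: float = 30.0, expect: Optional[str] = None) -> Tuple[int, str]:
    """Return (exit_code, message): 0 means healthy, 1 means unhealthy."""
    try:
        with _opener.open(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx: keep the (truncated) error body so it lands in the health log.
        body = err.read().decode("utf-8", errors="replace")[:MAX_BODY_CHARS]
        return 1, f"healthcheck URL {url} FAILED, returning [{err.code} {err.reason}]. Error response: {body}"
    except OSError as err:
        # Timeouts and refused connections have no body, so say so explicitly.
        return 1, f"healthcheck URL {url} FAILED with no HTTP response (timeout {timeout:g}s): {err}"
    if expect is not None and expect not in body:
        return 1, f"healthcheck URL {url} FAILED, response did not contain {expect!r}: {body[:MAX_BODY_CHARS]}"
    return 0, body[:MAX_BODY_CHARS]


def main() -> int:
    """Hypothetical entry point: read the URL from HEALTHCHECK_URI and print the result."""
    code, message = check(os.environ["HEALTHCHECK_URI"])
    print(message)
    return code
```

A wrapper script would end with something like sys.exit(main()), so Docker sees exit code 0 or 1 and stores the printed message as the healthcheck output.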

Going forward, if you need to debug a failed container, try using jq to pull back only the container state, which includes the health section:

docker inspect --format "{{json .State }}" <container name> | jq

That should give you information on exactly where the container failed. I'm also going to update the docs with this information.

[Screenshot 2024-04-03 at 2 20 01 PM]
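If you end up doing this a lot, the same information can be boiled down to a few lines. This is only a sketch: the helper names inspect_state and summarize are made up, and it assumes the usual field names in the docker inspect State output (Status, OOMKilled, ExitCode, and Health with Status, FailingStreak, and Log). It shows the OOM flag next to the last few healthcheck results, which helps tell an OOM kill apart from a failing URL.

```python
import json
import subprocess


def inspect_state(container: str) -> dict:
    """Run the docker inspect command shown above and parse its JSON output."""
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{json .State}}", container],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def summarize(state: dict, last: int = 3) -> list:
    """Return short text lines: container status, health status, recent check output."""
    lines = [
        f"status={state.get('Status')} oom_killed={state.get('OOMKilled')} "
        f"exit_code={state.get('ExitCode')}"
    ]
    # "Health" is missing entirely when the container has no healthcheck configured.
    health = state.get("Health") or {}
    lines.append(f"health={health.get('Status')} failing_streak={health.get('FailingStreak')}")
    for entry in (health.get("Log") or [])[-last:]:
        output = (entry.get("Output") or "").strip()
        # Truncate long outputs such as full error pages.
        lines.append(f"{entry.get('End')} exit={entry.get('ExitCode')} {output[:200]}")
    return lines
```

Usage would be along the lines of print("\n".join(summarize(inspect_state("<container name>")))).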


Logging on healthcheck fail #76