opensafely-core / opencodelists

OpenCodelists is an open platform for creating and sharing codelists of clinical terms and drugs.
https://www.opencodelists.org
Other
31 stars 11 forks source link

Production container health check run when deploying upcoming container #2156

Open iaindillingham opened 2 days ago

iaindillingham commented 2 days ago

Context

After a new Docker image is deployed to the Dokku app, a new container, which is called the upcoming container, is started from the new Docker image. A health check is run against the upcoming container. If the health check succeeds, then the current container is replaced with the upcoming container. If the health check fails, then the current container is not replaced.

Problem

The health check is configured in app.json. It's a Web check, which according to the "Zero Downtime Deploy Checks" page means that a curl command is constructed from the information in app.json by docker-container-healthchecker. We know from experience that the target of the curl command is an internal IP address, bypassing the DNS and the reverse proxy. The curl command is executed from the Dokku host.

Changes to app.json should be applied on the next deployment. However, we know from experience that changes to app.json are applied on the deployment after the next deployment. This would suggest that the curl command is constructed from app.json on the current container, rather than app.json on the upcoming container.

More importantly, it's not clear whether the target of the curl command is the internal IP address of the upcoming container or the internal IP address of the current container.

Solution

We captured the logs (STDOUT and STDERR) from the current container whilst a new Docker image was being deployed.[^1] We also captured the logs from the health check. The latter weren't a subset of the former; that is, the logs from the health check did not appear in the logs from the current container. Furthermore, the logs from the current container did not contain any requests that appeared to be made by the health check.

We captured the logs from the current container whilst the health check was executed from the Dokku host (dokku checks:run opencodelists). We also captured the logs from the health check. The latter were a subset of the former; that is, the logs from the health check appeared in the logs from the current container.

We conclude that the target of the curl command is the internal IP address of the upcoming container. The health check is valuable, with the caveat that changes to app.json should be deployed twice.

[^1]: As an aside, we triggered the workflow dispatch event to deploy a new Docker image. We compared the SHAs of the current container and the upcoming container to establish that the current container was replaced by the upcoming container.