levibostian closed 4 years ago
After further inspection, the 503 response is sent by terminus. I thought that the nginx ingress sent the 503 when the pods were down. I did not realize that terminus sends this response itself.
This makes me think, however, that the Postmark healthcheck might be better as a readiness check when the app is starting up, to make sure that the client is set up correctly. Postmark is not a core part of the app in the way that a Redis or Postgres DB is. Let's move the Postmark check to the readiness check that runs when the app starts up.
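A minimal sketch of the startup-only check, assuming each check is an async function that throws on failure. `Check` and `StartupGate` are hypothetical names for illustration. They are not part of terminus, and wiring the result into a readiness endpoint is left out.

```typescript
// Hypothetical shape of a check: resolves when healthy, throws otherwise.
type Check = () => Promise<void>;

// Runs non-core checks (for example, the Postmark client setup) at startup
// and remembers the outcome, so the recurring healthcheck never calls them.
class StartupGate {
  private ready = false;

  constructor(private readonly startupChecks: Check[]) {}

  // Returns true only if every startup check passed. Never throws.
  async run(): Promise<boolean> {
    try {
      for (const check of this.startupChecks) {
        await check();
      }
      this.ready = true;
    } catch {
      this.ready = false;
    }
    return this.ready;
  }

  // Cheap, synchronous answer for a readiness endpoint.
  isReady(): boolean {
    return this.ready;
  }
}
```

With this split, a Postmark outage after startup would not affect the recurring healthcheck, which would only cover core dependencies such as Redis and Postgres.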
Done
I have an app deployed on k8s, and it recently experienced some downtime.
At the same time as the downtime, I received a Honeybadger report. What was happening was this:
Expected outcome
Healthchecks return false but do not crash. Then k8s will not direct traffic to that pod.
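A rough sketch of that outcome, assuming each dependency check is an async function that throws when the dependency is unhealthy. `runChecks`, `statusFor`, and the check names are hypothetical and only illustrate the idea. This is not the terminus API.

```typescript
// Hypothetical shape of a check: resolves when healthy, throws otherwise.
type Check = () => Promise<void>;

interface HealthResult {
  healthy: boolean;
  failures: string[]; // names of the checks that threw
}

// Runs every check and converts thrown errors into a result value,
// so a failing dependency can never crash the process from here.
async function runChecks(checks: Record<string, Check>): Promise<HealthResult> {
  const failures: string[] = [];
  for (const name of Object.keys(checks)) {
    try {
      await checks[name]();
    } catch (err) {
      failures.push(name);
    }
  }
  return { healthy: failures.length === 0, failures };
}

// 503 tells k8s to stop routing traffic to this pod; the pod keeps running.
function statusFor(result: HealthResult): number {
  return result.healthy ? 200 : 503;
}
```

The point of returning a value instead of rethrowing is that an unhealthy pod is taken out of rotation by the probe and is not restarted because of an uncaught exception.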
I might want to consider everywhere else in my app where this could happen. When an exception happens in the app, do we want the app to restart? Is that intended behavior? Restarting should happen only when an error is uncaught, meaning it was not handled. We need to catch more exceptions and return a 500 instead of crashing the pod, to prevent downtime. Downtime is bad because while the pod is down, no client can reach any of the other endpoints.
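One way this could look, as a sketch only: wrap each request handler so that a thrown error becomes a 500 response. `AppResponse`, `Handler`, and `withErrorBoundary` are hypothetical names and are not tied to any particular web framework.

```typescript
// Hypothetical, framework-free request handler types.
interface AppResponse {
  status: number;
  body: string;
}
type Handler = (path: string) => Promise<AppResponse>;

// Wraps a handler so any thrown error is logged and answered with a 500
// instead of propagating as an uncaught exception that restarts the pod.
function withErrorBoundary(handler: Handler): Handler {
  return async (path) => {
    try {
      return await handler(path);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      console.error(`request to ${path} failed: ${message}`);
      return { status: 500, body: "Internal Server Error" };
    }
  };
}
```

Only the failing request gets the 500. Other endpoints keep serving, and anything that escapes the wrapper is still uncaught and still triggers a restart.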