bluesky / tiled

API to structured data
https://blueskyproject.io/tiled
BSD 3-Clause "New" or "Revised" License

Update `/healthz` endpoint when shutting down #780

Closed danielballan closed 1 month ago

danielballan commented 2 months ago

In the shutdown ASGI lifecycle hook...

https://github.com/bluesky/tiled/blob/7f7329de1b4ab39f502075656102585cdcc35f7c/tiled/server/app.py#L668

...

We need the app to respond to SIGTERM by updating some state, maybe on app.state, that the /healthz endpoint can reference. This should cause /healthz to return a 500 status code, because the status code is what k8s (and other standard tools) actually check. The body (JSON...) is for humans.

The app needs to wait long enough for HAProxy (or k8s or whatever) to poll /healthz. I think we'll need this to be configurable somehow, because for dev and small deployments we do not want to wait ~10 seconds for the app to shut down after receiving SIGTERM.
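
A minimal sketch of the idea, written as a bare ASGI callable so it needs nothing beyond the standard library. It is not tiled's implementation: `ShutdownState`, `drain_delay`, `make_app`, and `get_health` are hypothetical names, and the flag is flipped in the lifespan shutdown hook only because that is the hook proposed above. Whether the server is still routing health checks when that hook runs depends on the ASGI server, which this sketch does not address.

```python
import asyncio
import json


class ShutdownState:
    """Hypothetical stand-in for the state the issue suggests keeping on app.state."""

    def __init__(self, drain_delay: float = 0.0):
        self.shutting_down = False
        # Seconds to wait after flipping the flag; 0 suits dev and small deployments.
        self.drain_delay = drain_delay


def make_app(state: ShutdownState):
    async def app(scope, receive, send):
        if scope["type"] == "lifespan":
            while True:
                message = await receive()
                if message["type"] == "lifespan.startup":
                    await send({"type": "lifespan.startup.complete"})
                elif message["type"] == "lifespan.shutdown":
                    # Flip the flag first, then give pollers time to notice it.
                    state.shutting_down = True
                    await asyncio.sleep(state.drain_delay)
                    await send({"type": "lifespan.shutdown.complete"})
                    return
        elif scope["type"] == "http" and scope["path"] == "/healthz":
            # The status code is what tooling checks; the JSON body is for humans.
            status = 500 if state.shutting_down else 200
            text = "shutting down" if state.shutting_down else "ready"
            await send({
                "type": "http.response.start",
                "status": status,
                "headers": [(b"content-type", b"application/json")],
            })
            await send({
                "type": "http.response.body",
                "body": json.dumps({"status": text}).encode(),
            })
        elif scope["type"] == "http":
            await send({"type": "http.response.start", "status": 404, "headers": []})
            await send({"type": "http.response.body", "body": b""})

    return app


async def get_health(app):
    """Call /healthz in-process with a fake request; returns (status, parsed body)."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": "GET", "path": "/healthz"}, receive, send)
    return sent[0]["status"], json.loads(sent[1]["body"])
```

With `drain_delay` defaulting to 0, a dev server would shut down immediately, while a deployment behind a load balancer could set it to slightly more than the poll interval.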

danielballan commented 2 months ago

Sequence of events:

  1. Application receives SIGTERM (or SIGINT if it is running in a terminal and stopped with ^C).
  2. Application updates /healthz.
  3. Application waits for HAProxy to poll.
  4. Application stops accepting new connections.
  5. Application waits for requests in progress to finish.
  6. Application terminates.
  7. Process exits.
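
The ordering above can be simulated with plain asyncio. This is only a toy model under stated assumptions: `shutdown_sequence`, `poll_wait`, the `state` dict, and `demo` are hypothetical names, nothing here hooks into a real signal handler or server, and step 7 (process exit) is left to the interpreter.

```python
import asyncio


async def shutdown_sequence(state, poll_wait, in_flight, log):
    """Walk through steps 1-6 in order, recording each one in `log`."""
    log.append("signal received")          # 1. SIGTERM (or SIGINT) arrives
    state["healthy"] = False               # 2. /healthz starts reporting failure
    log.append("healthz updated")
    await asyncio.sleep(poll_wait)         # 3. give the load balancer time to poll
    log.append("poll wait elapsed")
    state["accepting"] = False             # 4. stop accepting new connections
    log.append("stopped accepting")
    await asyncio.gather(*in_flight)       # 5. let requests in progress finish
    log.append("requests drained")
    log.append("application terminated")   # 6. application terminates


async def demo():
    state = {"healthy": True, "accepting": True}
    log = []

    async def slow_request():
        # Stands in for a request that is still running when the signal arrives.
        await asyncio.sleep(0.01)
        log.append("request finished")

    task = asyncio.create_task(slow_request())
    await shutdown_sequence(state, poll_wait=0.0, in_flight=[task], log=log)
    return state, log
```

Running `asyncio.run(demo())` returns the final state and the recorded order of events, which makes it easy to check that the in-flight request completes before the application is marked as terminated.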

danielballan commented 1 month ago

We concluded that this is technically viable but not the best approach.