krakend / krakend-ce

KrakenD Community Edition: High-performance, stateless, declarative, API Gateway written in Go.
https://www.krakend.io
Apache License 2.0

Graceful restart with tableflip #787

Open fxedel opened 11 months ago

fxedel commented 11 months ago

Version of KrakenD you are using

KrakenD Version: 2.4.3
Go Version: 1.20.6
Glibc Version: MUSL-1.2.4_(alpine-3.18.2)

Is your feature request related to a problem? Please describe.

Restarting KrakenD always causes a short downtime on that machine: the old process shuts down and closes its HTTP listen socket, and only then does a new process start, run its initialization, and begin listening. For an API gateway, high availability is usually desired.

Describe the solution you'd like

Implement graceful restart via cloudflare/tableflip. The restart works like so:

If the new process fails during initialization, for example by panicking on an invalid config file or by exceeding a configurable startup timeout, the old process does not shut down and keeps serving requests. This ensures that a usable KrakenD process is running at all times.

This graceful restart strategy is in fact inspired by nginx reloads; see Cloudflare's blog post.

Describe alternatives you've considered

The documentation recommends using blue/green deployments. While this can be straightforward in a Kubernetes or cloud setup, it might not be usable in all situations. A simple built-in graceful restart, just like nginx's, would make it possible to update the configuration with zero downtime and without changing anything in the server infrastructure. I would consider this an additional restart option, so that there are different options that are more or less suited to different setups.

github-actions[bot] commented 8 months ago

This issue is marked as stale because it has been open over 90 days with no activity. Remove the stale label or comment or this will be closed in 15 days.

github-actions[bot] commented 5 months ago

This issue is marked as stale because it has been open over 90 days with no activity. Remove the stale label or comment or this will be closed in 15 days.