CAAPIM / apim-charts

Helm Charts for Layer7 API Management components.
MIT License

charts/gateway added gateway container lifecycle configuration #228

Closed Gazza7205 closed 1 year ago

Gazza7205 commented 1 year ago

Description of the change

Updates to Gateway Container Lifecycle.

During upgrades and other events where Gateway pods are replaced, you may have APIs/Services with long-running connections still open.

This functionality delays Kubernetes sending a SIGTERM to the Gateway container while connections remain open. It works in conjunction with terminationGracePeriodSeconds, which should always be higher than preStopScript.timeoutSeconds. If preStopScript.timeoutSeconds is exceeded, the script exits 0 and normal pod termination resumes.

The graceful termination (preStop script) is disabled by default.

| Parameter | Description | Default |
| --- | --- | --- |
| lifecycleHooks | Custom lifecycle hooks, takes precedence over the preStopScript | {} |
| preStopScript.enabled | Enable the preStop script | false |
| preStopScript.periodSeconds | The time in seconds between checks | 3 |
| preStopScript.timeoutSeconds | Timeout in seconds, must be lower than terminationGracePeriodSeconds | 60 |
| preStopScript.excludedPorts | Array of ports that should be excluded from the preStop script check | [8777, 2124] |
| terminationGracePeriodSeconds | Default duration in seconds Kubernetes waits for the container to exit before sending the kill signal | see values.yaml |
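
As a sketch of how these parameters fit together, a values override that enables the preStop script might look like the following. The file name and the specific numbers (a 120-second timeout with a 150-second grace period) are illustrative assumptions, not recommended values. The only constraint taken from the description above is that terminationGracePeriodSeconds stays higher than preStopScript.timeoutSeconds.

```yaml
# Hypothetical values override for the gateway chart (e.g. my-values.yaml).
# The numbers are illustrative assumptions; the keys are those listed in the
# parameter table above.

# Must be higher than preStopScript.timeoutSeconds so the script can finish
# before Kubernetes sends the kill signal.
terminationGracePeriodSeconds: 150

preStopScript:
  enabled: true        # disabled by default
  periodSeconds: 3     # seconds between checks (chart default)
  timeoutSeconds: 120  # after this long the script exits 0 and termination resumes
  excludedPorts:       # ports excluded from the check (chart defaults)
    - 8777
    - 2124

# Left at its default; custom hooks set here would take precedence over
# the preStopScript.
lifecycleHooks: {}
```

A file like this would typically be supplied to helm install or helm upgrade with the -f flag.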

Benefits

Existing connections during pod termination are respected for a configurable period of time.

Applicable issues

DE569702


burbanski commented 1 year ago

Hello, @Gazza7205. We should make clear in the readme that the preStopScript monitors connections to inbound (not outbound) Gateway application TCP ports, i.e. inbound listener ports opened by the Gateway application and not by some other process, except those that are explicitly excluded. We should also explain why 8777 and 2124 are excluded by default.

We should also make clear that the preStopScript will exit immediately when it detects no open connections on the ports it's monitoring, and not wait for the full preStopScript.timeoutSeconds.

We should perhaps also say that preStopScript.timeoutSeconds and terminationGracePeriodSeconds have no theoretical limits, but that other configuration and events in a Kubernetes environment could supersede them, so they become less reliable over longer periods. Graceful termination of very long-running workloads might be best handled by a separate, differently configured and managed deployment (e.g. HPA disabled) dedicated to those workloads.
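
To illustrate the dedicated-deployment idea, here is a rough sketch of values for a second Gateway release reserved for long-running connections. The one-hour timeout is an arbitrary example, and the autoscaling key is an assumption about how the chart exposes its HPA toggle, so it should be checked against values.yaml before use.

```yaml
# Hypothetical values for a separate gateway release dedicated to
# long-running workloads. All numbers are illustrative assumptions.

# Kept higher than preStopScript.timeoutSeconds, as the PR description requires.
terminationGracePeriodSeconds: 3660

preStopScript:
  enabled: true
  periodSeconds: 3
  timeoutSeconds: 3600  # wait up to one hour for open connections to close

# Assumption: the chart's HPA toggle lives under this key; verify in values.yaml.
autoscaling:
  enabled: false
```

Even with values like these, node drains or other cluster events may still end a pod sooner, which is the reliability caveat raised above.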