This PR adds graceful shutdown to the Multus daemon by introducing a `/readyz` endpoint alongside the existing `/healthz`. Once the daemon receives a SIGTERM, `/readyz` starts returning 500 to signal that it is shutting down. CNI requests can still be processed for a short window during this time. The daemonset configs now raise `terminationGracePeriodSeconds` from 10 to 30 seconds, leaving more time for a clean shutdown.
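To illustrate the readiness behavior described above, here is a minimal sketch of a handler that flips from 200 to 500 once shutdown begins. It is not the code from this PR: the `readiness` type, `beginShutdown`, `newMux`, and `probe` are hypothetical names, and keeping `/healthz` at 200 during shutdown is an assumption. In a real daemon, `beginShutdown` would be called from a SIGTERM handler (for example one registered with `signal.Notify`); here it is called directly so the example runs without sending signals.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// readiness tracks whether shutdown has begun.
// Hypothetical type for illustration; not taken from the Multus source.
type readiness struct {
	shuttingDown atomic.Bool
}

// beginShutdown is what a SIGTERM handler would call.
func (r *readiness) beginShutdown() { r.shuttingDown.Store(true) }

// readyz returns 200 until shutdown begins, then 500.
func (r *readiness) readyz(w http.ResponseWriter, _ *http.Request) {
	if r.shuttingDown.Load() {
		http.Error(w, "shutting down", http.StatusInternalServerError)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// newMux wires /readyz and /healthz. Assumption: /healthz keeps
// returning 200 while the process is still alive.
func newMux(r *readiness) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/readyz", r.readyz)
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	return mux
}

// probe issues an in-memory GET against h and returns the status code.
func probe(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	rd := &readiness{}
	mux := newMux(rd)
	fmt.Println("before:", probe(mux, "/readyz"), probe(mux, "/healthz")) // before: 200 200
	rd.beginShutdown() // simulate receiving SIGTERM
	fmt.Println("after:", probe(mux, "/readyz"), probe(mux, "/healthz")) // after: 500 200
}
```

Using an atomic flag keeps the handler safe to call from many goroutines while the signal handler flips the state once.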
This addresses a race condition during pod transitions: the readiness check can return true, yet a subsequent CNI request fails because the daemon shuts down too quickly. With the `/readyz` endpoint and the delayed shutdown, in-flight CNI requests are handled more gracefully, which reduces the risk of disruption during these transitions.
Major thanks to @deads2k for the find, identification, fix, and of course, the explanations. Appreciate it.
coverage: 63.822% (-0.04%) from 63.857%
when pulling 531dec1c916d746aabf3ad800803ee0a82c8a11b on dougbtv:thickplugin_graceful_term2
into f1e887e2396c98e9aee6417723f2c5cd433a1cd2 on k8snetworkplumbingwg:master.