Constellation is the first Confidential Kubernetes. Constellation shields entire Kubernetes clusters from the (cloud) infrastructure using confidential computing.
GNU Affero General Public License v3.0
903
stars
47
forks
source link
debugd: reset unit failed status before restarting #3183
On debug images, our bootstrapper and upgrade-agent systemd units continuously fail until the binaries are uploaded using cdbg deploy.
Once this is done, the debugd issues a systemctl restart command to those units.
Under some circumstances, this restart can fall into the timeout period from our units being rate limited in restarting by systemd.
We can use systemctl reset-failed to reset the failed counter and by this way bypass the timeout.
Context
On debug images, our bootstrapper and upgrade-agent systemd units continuously fail until the binaries are uploaded using
cdbg deploy
. Once this is done, thedebugd
issues asystemctl restart
command to those units. Under some circumstances, this restart can fall into the timeout period from our units being rate limited in restarting by systemd. We can usesystemctl reset-failed
to reset the failed counter and by this way bypass the timeout.See https://bugzilla.redhat.com/show_bug.cgi?id=1016548 for some more details about the issue. See the man page for
systemctl reset-failed
.Proposed change(s)
systemctl reset-failed
before restarting a unit indebugd
Related issue
Additional info
ref/fix-debugd-unit-restarting/stream/debug/v2.17.0-pre.0.20240620090724-9385e634a613
Checklist