Closed schu closed 7 years ago
It does rely on the fact that run programs actually report ready to systemd-run
, which seems to be true for rkt app sandbox
(or rather systemd
, as run by rkt app sandbox
).
Currently, StartProcess
is only used from RunPodSandbox
.
Yes, and support for sd_notify was improved in rkt v1.15.0 so rkt handles this correctly now.
This looks a bit fishy. You are getting a generic error exit code (NOT the child one), because the process exits before signaling (violating the sd_notify protocol):
$ systemd-run --service=notify /bin/bash -c "exit 2" ; echo $?
1
This looks a bit fishy. You are getting a generic error exit code (NOT the child one), because the process exits before signaling (violating the sd_notify protocol):
I am not sure what you are suggesting. rkt app sandbox
should follow the sd_notify protocol, unless it fails before starting systemd-nspawn
.
rkt app add
probably does not follow the fd_notify
protocol but do we use systemd-run
for thoses? If it is the case, I guess we will notice it:
$ sudo systemd-run --service=notify /bin/bash -c "exit 0" ; echo $?
Job for run-rf6c75562d5304b1db59655ef65bf8e10.service failed because the service did not take the steps required by its unit configuration.
See "systemctl status run-rf6c75562d5304b1db59655ef65bf8e10.service" and "journalctl -xe" for details.
1
I am not sure what you are suggesting. rkt app sandbox should follow the sd_notify protocol, unless it fails before starting systemd-nspawn.
rkt app sandbox
could sd_notify
an ERRNO
, I guess.
You are getting a generic error exit code (NOT the child one), because the process exits before signaling (violating the sd_notify protocol)
Yes. While it's a generic error code, it makes StartProcess
return an error message for a transient unit that fails. It could be argued that StartProcess
should only care about the sucess / failure of systemd-run
, but in the current use case waiting for READY=1
seems more appropriate to me (make sure the actual error is returned to the kubelet; avoid waiting for uuid file unnecessarily).
rkt app sandbox could sd_notify an ERRNO, I guess.
Let's track this in #123 and we can merge this PR as-is.
Otherwise, an error in the run command is not detected and
systemd-run
returns 0 in any case. (systemd-run false ; echo $?
versussystemd-run --service=notify false ; echo $?
)With this change, an error in a transient unit (as described in https://github.com/kubernetes-incubator/rktlet/issues/108) would be easier to debug, as
RunPodSandbox
now returns the actual error to the kubelet.