Closed egernst closed 6 years ago
This looks like a bug in crio. If you force an error running the runtime standalone...
... you get the expected:
$ sudo cc-runtime run --bundle "$bundle" baz >stdout 2>stderr
$ cat stdout
$ cat stderr
ERROR received from VM agent, control msg received : Process could not be started: container_linux.go:296: starting container process caused "exec: \"blablab\": executable file not found in $PATH"
The problem appears to be here....
... and here:
meaning the runtime's error message goes to the system journal. That appears to be happening, but the journal entries make it look as if the error comes from the crio daemon itself, which is clearly incorrect.
With CRIO+cc-runtime, if you start a container workload with a garbage command to execute, the exact error is not passed back to crictl and the container is left in an unknown state. Errors are observed at container start time rather than at creation time (as is observed when using runc); that in itself isn't really an issue. The issue is that the container is reported in an unknown state and, from the perspective of CRIO, cannot be removed. On top of this, it appears that the error itself isn't propagated back up to crictl, just a generic "error - status 1".
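To illustrate what "propagating the error" would mean here, the following is a minimal shell sketch of a caller that keeps the child's stderr and reports it alongside the exit status, rather than only the status. It is not crio's or cc-runtime's code; `run_and_report` is a hypothetical helper name used purely for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of crio or cc-runtime): run a command and,
# on failure, report the command's own stderr together with its exit status.
run_and_report() {
    err_file=$(mktemp) || return 1

    # Capture the child's stderr in a file so it is not lost or misattributed.
    status=0
    "$@" 2>"$err_file" || status=$?

    if [ "$status" -ne 0 ]; then
        # Include the child's message, not just a generic numeric status.
        printf 'error - status %d: %s\n' "$status" "$(cat "$err_file")" >&2
    fi

    rm -f "$err_file"
    return "$status"
}
```

Called as `run_and_report sudo cc-runtime run --bundle "$bundle" baz`, a wrapper like this would surface the runtime's "executable file not found" text to the caller instead of leaving it only in the journal, assuming the runtime writes that text to stderr as in the standalone run above.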
To easily reproduce, grab sample pod/container json:
Expected Behavior, per runc:
When running with runc, the failure occurs at container creation time, and cleanup occurs without issue:
Issue observed when using cc-runtime:
Looking at the journal, I see some of the error messages from the proxy that would have been reported to crictl in the runc case:
At this point, ps shows that the container is in an unknown state:
And now we cannot remove the container, and thus cannot remove the sandbox either: