rkt / rkt

[Project ended] rkt is a pod-native container engine for Linux. It is composable, secure, and built on standards.
Apache License 2.0
8.82k stars 883 forks source link

kvm stage1 should return pod exit status #2777

Open lucab opened 8 years ago

lucab commented 8 years ago

While writing test cases, I realized that kvm flavors always return 0 even if the application exit with some other exit codes. Behavior should be aligned with non-kvm flavors, where user/caller gets a meaningful exit status upon termination. I think @alban and others were already aware of this, but I didn't find any report tracking it.

/cc @jellonek

tmrts commented 8 years ago

@pskrzyns can you take a look at this one?

pskrzyns commented 8 years ago

I'm on vacation till monday :) @jjlakis @lukasredynk could you look at it ?

pskrzyns commented 8 years ago

@tmrts @alban can we create team kvm on this repo ?

tmrts commented 8 years ago

@pskrzyns :+1:

iaguis commented 8 years ago

@coreos/rkt-kvm-maintainers can you set a proper milestone here?

iaguis commented 8 years ago

I'll move it to v1.12.0

jjlakis commented 8 years ago

@iaguis Thanks

lucab commented 8 years ago

Bump, @jjlakis any chance you can have a look at this?

jellonek commented 8 years ago

I see no other way to support this than running hypervisor engine (lkvm, qemu) by other process, and then collect status from remains of pod as this is already done by rkt status. So rkt binary would not exec directly to hypervisor engine, but to this overlapping helper binary. This looks annoying but seems to be very simple in implementing... Imo this binary should be statically linked C program, small binary like enter, prepare-app because of golang overhead...

lucab commented 8 years ago

I see. I was probably too optimistic asking for stage1 to just bubble-up the return value from systemd, but skimming a bit I realized lkvm/bzimage have no concept of a return value (uml has, though).

Thinking a bit more on this, the binary wrapper may not even be such a great solution because we don't want to introduce another long-running component, and it will miss anyway errors from failed stage1 setup (while stage2 status can be already retrieved the way you mentioned).

The original context for this was that we have some test units using this feature on the coreos flavor, and we end up not running some tests on kvm due to this. At this point, however, I think it is better to adjust the testsuite to be aware of this, instead of introducing other moving parts here.

De-milestoning but keeping this around in case we get better support from the intermediate layers in the future.

grahamwhaley commented 7 years ago

Just as an update/note here - the same situation exists for other VM container systems. In Clear Containers we have gone with the 'intermediate binary' approach, where we have an extra process that reaps the exit code and returns it to the higher level. A fundamental issue is that (for instance) qemu does not return the exit code of its payload, but of itself - thus, directly running qemu will never get you the payload exit code, unless we could add a feature to qemu. When we come to re-visit this issue we can look at re-using the Clear Containers code. CC @jodh-intel