containers / virtcontainers

A Go package for building hardware virtualized container runtimes
Apache License 2.0

container: Change block device hotplug ordering to get it effective on Kata Containers agent #593

Closed sboeuf closed 6 years ago

sboeuf commented 6 years ago

The goal of this pull request is to enable support for block devices with Kata Containers. It turned out that our sequence in container.go was tied to the hyperstart agent, which expects things to be hotplugged only at the start stage. For Kata Containers, this needs to happen at the creation stage. This change does not hurt the hyperstart case, and it makes block devices work for Kata too. This pull request also solves other issues around implicit updates of the mount list that were discovered while solving the initial problem. That is why it also refactors the way bind mounts from each agent implementation update the list of mounts held by the container structure.
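
A minimal Go sketch of the mount-list refactor described above. The types and names here (Mount, Container, agent, mountSharedDirMounts) are simplified stand-ins, not the real virtcontainers API; the point is that the agent returns the updated mount list explicitly instead of mutating the container as a side effect.

```go
package main

import "fmt"

// Mount is a simplified bind-mount description.
type Mount struct {
	Source      string
	Destination string
}

// Container holds the mounts that will be passed to the guest.
type Container struct {
	mounts []Mount
}

// agent is a stand-in for an agent implementation (hyperstart, kata, ...).
type agent struct{}

// mountSharedDirMounts returns the mounts it actually set up, so the caller
// decides how to update the container, rather than the agent doing it implicitly.
func (a *agent) mountSharedDirMounts(mounts []Mount) ([]Mount, error) {
	updated := make([]Mount, 0, len(mounts))
	for _, m := range mounts {
		// ... bind-mount m.Source into the shared directory here ...
		updated = append(updated, m)
	}
	return updated, nil
}

func main() {
	c := &Container{mounts: []Mount{{Source: "/tmp/data", Destination: "/data"}}}
	a := &agent{}

	newMounts, err := a.mountSharedDirMounts(c.mounts)
	if err != nil {
		fmt.Println("mount setup failed:", err)
		return
	}
	// Explicit update in one place, instead of a hidden side effect in the agent.
	c.mounts = newMounts
	fmt.Printf("container now tracks %d mount(s)\n", len(c.mounts))
}
```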

jodh-intel commented 6 years ago

lgtm

Have we got tests in place to check these changes though? Have you run @grahamwhaley's magic https://github.com/clearcontainers/tests/blob/master/integration/stability/soak_parallel_rm.sh on this?

@grahamwhaley - actually, is that script ready to be run as part of the CI checks now, do you think? If so, could you raise an issue for that?

Approved with PullApprove

grahamwhaley commented 6 years ago

@jodh-intel we need to check and probably rework that test a little. I think it needs two updates.

/me notes all the CIs look to have failed?

egernst commented 6 years ago

Can you clarify why the difference between the kata-agent (at creation) and the cc-agent (at start)? Is there any performance impact from doing the hotplugs earlier?

Shouldn't we still be able to hot-plug devices later as well, or is this just specific to the container workload rootfs?

sboeuf commented 6 years ago

@egernst the code taking care of the rootfs hotplug (in the devicemapper case) has to be called before we ask the agent to create the container in the Kata case, otherwise the container won't be created properly. With hyperstart, we were doing this at the start stage because the call to agent.createContainer() was a no-op in that case. I think hyperstart is more of a special case; the logic is to prepare everything (rootfs, mounts, devices) so that the container can be created without missing any resources. That's why I moved this code from container.start() to createContainer() globally. We were doing a lot of things at the start stage, but we realized that most of that work has to be performed at creation time. This way, the start stage becomes a simple trigger to actually start the container workload. Does that make sense?
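
A hedged sketch of the ordering change described above. The function and field names (hotplugBlockDevices, agentCreateContainer, create, start) are made up for illustration and do not match the real virtcontainers code; they only show why the hotplug has to happen before the agent-side container creation.

```go
package main

import "fmt"

type container struct {
	rootfsHotplugged bool
	created          bool
	started          bool
}

// hotplugBlockDevices stands in for hotplugging the devicemapper rootfs
// (and other block devices) into the VM.
func (c *container) hotplugBlockDevices() {
	c.rootfsHotplugged = true
}

// agentCreateContainer stands in for agent.createContainer(): a no-op for
// hyperstart, but the kata agent needs the rootfs already hotplugged here.
func (c *container) agentCreateContainer() error {
	if !c.rootfsHotplugged {
		return fmt.Errorf("rootfs block device not available in the guest")
	}
	c.created = true
	return nil
}

// create prepares rootfs, mounts and devices before asking the agent to
// create the container, so no resource is missing when the agent needs it.
func (c *container) create() error {
	c.hotplugBlockDevices()
	return c.agentCreateContainer()
}

// start becomes a simple trigger for the container workload.
func (c *container) start() error {
	if !c.created {
		return fmt.Errorf("container not created")
	}
	c.started = true
	return nil
}

func main() {
	c := &container{}
	if err := c.create(); err != nil {
		fmt.Println("create failed:", err)
		return
	}
	if err := c.start(); err != nil {
		fmt.Println("start failed:", err)
		return
	}
	fmt.Println("container started")
}
```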

sboeuf commented 6 years ago

Note on this PR. I think I have found why the CRI-O tests were failing only in the devicemapper case (Ubuntu 17.10). All the failing tests were dealing with created containers that were never started. In this case, when the container is stopped by CRI-O, we get a kill signal, meaning we kill the shim without passing the signal to the agent (not supported), and we set the container state to STOPPED. Because of this, container.stop() is called (when CRI-O triggers a cc-runtime state), but it returns immediately since the container is already stopped. In the context of this PR, which moves the rootfs creation from container.start() to createContainer(), this is the root cause of the observed issues. Indeed, container.stop() not being called was not an issue until now, because a CREATED container had not done any extra mount or hotplug.
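
Illustrative only: a simplified stop() showing why the early return for an already-STOPPED container skips cleanup once the mounts/hotplug are done at creation time. Names here do not match the real virtcontainers code.

```go
package main

import "fmt"

type state string

const (
	ready   state = "ready"
	stopped state = "stopped"
)

type container struct {
	state       state
	resourcesUp bool // mounts done and devices hotplugged at create time
}

func (c *container) stop() error {
	if c.state == stopped {
		// Early return: harmless when nothing was set up before start, but
		// once create() does the mounts/hotplug, this skips their teardown.
		return nil
	}
	c.resourcesUp = false // unmount and unplug here
	c.state = stopped
	return nil
}

func main() {
	// Created container that was killed (state forced to STOPPED) without
	// ever being started: stop() bails out and the resources are leaked.
	c := &container{state: stopped, resourcesUp: true}
	_ = c.stop()
	fmt.Println("resources still up:", c.resourcesUp)
}
```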

sboeuf commented 6 years ago

@sameo please review it. I have managed to get the CRI-O issues fixed by allowing a container to be STOPPED from a READY state (which is a valid case). To avoid any confusion, issue #606 is a different issue which needs to be addressed separately.
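
A minimal sketch of the state-transition check mentioned above, assuming a simplified State representation (the type, constants, and validTransition function are illustrative, not the actual virtcontainers API). It allows READY -> STOPPED so that a created-but-never-started container can still be cleaned up.

```go
package main

import "fmt"

type StateString string

const (
	StateReady   StateString = "ready"
	StateRunning StateString = "running"
	StateStopped StateString = "stopped"
)

// validTransition reports whether moving from oldState to newState is allowed.
// Rejecting READY -> STOPPED would mean teardown (unmounts, device unplug)
// never runs for containers that were created but never started.
func validTransition(oldState, newState StateString) error {
	switch newState {
	case StateRunning:
		if oldState == StateReady {
			return nil
		}
	case StateStopped:
		// READY -> STOPPED is a valid case: a created container can be
		// stopped without ever having been started.
		if oldState == StateReady || oldState == StateRunning {
			return nil
		}
	}
	return fmt.Errorf("invalid transition %s -> %s", oldState, newState)
}

func main() {
	fmt.Println(validTransition(StateReady, StateStopped))   // <nil>
	fmt.Println(validTransition(StateStopped, StateRunning)) // error
}
```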

sameo commented 6 years ago

LGTM

Approved with PullApprove