shim: Start shims inside appropriate namespaces

sboeuf commented 6 years ago

This PR first creates a new package nsenter so that we can spawn any code into a set of namespaces. Then, the virtcontainers code is modified to rely on this new package. At last, it starts shims from agents within the right PID namespace. In details, a shim representing a container process should enter its own PID namespace so that we can spawn other shims corresponding to exec'ed processes inside the same namespace. This is needed if we don't want some race conditions to happen, due to the exec'ed shims to never return their exit codes if the container process returned first.

Fixes #613

amshinde commented 6 years ago

@sboeuf Did some digging up on the issues filed surrounding the mount namespace issue when I was implementing support for devices. See this: https://github.com/golang/go/issues/8676

Also, that issue has code that checks if the namespace change was successful, we could do something similar in our tests: https://storage.googleapis.com/go-attachment/8676/0/checkns.c

amshinde commented 6 years ago

@sboeuf Doing things the way you are doing currently works (most of the time) and provides for a simple solution (instead of reexecing a new process), but I came across this article which does state that it is not 100% foolproof : https://www.weave.works/blog/linux-namespaces-and-go-don-t-mix

Basically golang may spawn a new thread for a blocking syscall even after LockOsThread() We have an issue in vishwanda/netlink repo related to the above: https://github.com/vishvananda/netns/issues/17

The issue seems to have been resolved in golang 1.10, but we may see an unsuccessful setns call in earlier versions.

Doing things in the shim code, may not be such a bad approach after all.

sboeuf commented 6 years ago

@jodh-intel PR updated with almost full coverage for unit tests and more than that a real test that we're entering namespaces. As @amshinde explained, we have some limitations with Go, even if this code is supposed to be foolproof.

About limitations, both mount and user namespaces cannot be entered with the Setns() syscall from a Go program because it's multithreaded. I honestly don't get the "why" of this limitation but after all I have read and my local testing, I can confirm we cannot enter those namespaces if we don't involve some C code, or the use of an external binary such as nsenter.
About being foolproof, I think this is safe since we know that we should not pass a callback involving a go routine through this package, otherwise it won't be executed in the expected namespace. Also, the second issue about having some new threads being created in case the current one takes too long should not happen in our case and will be solved with Go 1.10.

Basically, this package does the most that a Go package can do with netns. So my proposal is that we could take it in since it will help spawning our shims into expected PID and network namespaces, and worst case, it won't do any harm in case we hit the issues explained above once every 10000000000 times.

At least this whole PR brings the structure for the shims to be switched to the expected namespaces.

I think we should try to look into a better way to solve this, but this will involve more researches.

sboeuf commented 6 years ago

@sameo @jodh-intel @amshinde PR is ready for another review ;)

sboeuf commented 6 years ago

@amshinde PR updated, PTAL

sboeuf commented 6 years ago

@grahamwhaley @jodh-intel ready to be merged

containers / virtcontainers

shim: Start shims inside appropriate namespaces #637