youki-dev / youki

A container runtime written in Rust
https://youki-dev.github.io/youki/
Apache License 2.0

Wrong directory when using a tenant container and w/ a mount namespace #2751

Open jeromegn opened 7 months ago

jeromegn commented 7 months ago

I'm unable to use a tenant container for executing commands in a container because, as soon as setns(mnt_ns) is called, the tenant's container_init_process ends up in / of my host.

The situation is different from most usage of libcontainer:

Much debug logging later: I found that after calling apply_rest_namespaces, my current directory shifts to the root / outside of my dummy bind mount. To be clear, the directory listing after setns returns ["/", "/newroot"]. I'm expecting the process to "be" in /containers/my-container/rootfs.

Logs I added

**Main container:**

```
DEBUG unsharing for Mount
DEBUG after ns Mount entry: Ok(DirEntry("./upper"))
DEBUG after ns Mount entry: Ok(DirEntry("./lower"))
DEBUG after ns Mount entry: Ok(DirEntry("./bundles"))
DEBUG after ns Mount entry: Ok(DirEntry("./containers"))
DEBUG after ns Mount entry: Ok(DirEntry("./etc"))
DEBUG after ns Mount entry: Ok(DirEntry("./root"))
DEBUG after ns Mount entry: Ok(DirEntry("./run"))
DEBUG after ns Mount entry: Ok(DirEntry("./sys"))
DEBUG after ns Mount entry: Ok(DirEntry("./proc"))
DEBUG after ns Mount entry: Ok(DirEntry("./dev"))
...
DEBUG prepare rootfs rootfs="/bundles/sleep/rootfs"
DEBUG mount root fs "/bundles/sleep/rootfs"
...
DEBUG pivoting root to /bundles/sleep/rootfs
```

**Tenant:**

```
DEBUG unshare or setns: LinuxNamespace { typ: Cgroup, path: Some("/proc/242/ns/cgroup") }
DEBUG setns for Cgroup at /proc/242/ns/cgroup
DEBUG after ns Cgroup entry: Ok(DirEntry("./upper"))
DEBUG after ns Cgroup entry: Ok(DirEntry("./lower"))
DEBUG after ns Cgroup entry: Ok(DirEntry("./bundles"))
DEBUG after ns Cgroup entry: Ok(DirEntry("./containers"))
DEBUG after ns Cgroup entry: Ok(DirEntry("./etc"))
DEBUG after ns Cgroup entry: Ok(DirEntry("./root"))
DEBUG after ns Cgroup entry: Ok(DirEntry("./run"))
DEBUG after ns Cgroup entry: Ok(DirEntry("./sys"))
DEBUG after ns Cgroup entry: Ok(DirEntry("./proc"))
DEBUG after ns Cgroup entry: Ok(DirEntry("./dev"))
DEBUG unshare or setns: LinuxNamespace { typ: Mount, path: Some("/proc/242/ns/mnt") }
DEBUG setns for Mount at /proc/242/ns/mnt
DEBUG after ns Mount entry: Ok(DirEntry("./newroot"))
DEBUG after ns Mount entry: Ok(DirEntry("./fly"))
DEBUG after ns Mount entry: Ok(DirEntry("./root"))
DEBUG after ns Mount entry: Ok(DirEntry("./dev"))
```

I don't understand how that's happening. I tried changing various settings and code in my youki fork, but I can't get it to do anything else. I've confirmed that the process is getting a different mount namespace based on its inode. I've tried spawning threads to better isolate libcontainer operations (and all the syscalls it is making), but that didn't change anything.
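For readers who want to repeat the inode check mentioned above, here is a minimal sketch using only the Rust standard library. The function names `same_namespace` and `shares_mount_ns_with_self` are made up for illustration and are not part of libcontainer; the sketch assumes a Linux host with `/proc` mounted.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Returns true when both paths resolve to the same (device, inode) pair.
/// For `/proc/<pid>/ns/mnt` handles, an equal pair means the same mount namespace.
/// Hypothetical helper, not a libcontainer API.
pub fn same_namespace(a: &Path, b: &Path) -> io::Result<bool> {
    // fs::metadata follows the magic symlink and stats the namespace object itself.
    let meta_a = fs::metadata(a)?;
    let meta_b = fs::metadata(b)?;
    Ok(meta_a.dev() == meta_b.dev() && meta_a.ino() == meta_b.ino())
}

/// Does `pid` share the mount namespace of the calling process?
pub fn shares_mount_ns_with_self(pid: u32) -> io::Result<bool> {
    let other = format!("/proc/{}/ns/mnt", pid);
    same_namespace(Path::new("/proc/self/ns/mnt"), Path::new(&other))
}
```

With the PID from the logs above, `shares_mount_ns_with_self(242)` called from the tenant process after `setns` would be expected to return `Ok(true)` if the tenant really joined the main container's mount namespace.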

At this point I'm wondering if this is happening because of the weird bind mount I'm making from my initrd program. It could also be related to the fact that I'm creating containers all from the same process.

I'm using the default oci_spec::Spec with minimal changes to support host networking (#2745).

This all leads to the exec error below, because it can't find the uname program I'm trying to run (rightly so, it's not in the right mount namespace!):

ERROR executable for container process not found in PATH executable="uname"
ERROR failed to initialize container process: executable 'uname' not found in $PATH
ERROR failed to wait for init ready: exec process failed with error error in executing process : executable 'uname' not found in $PATH
ERROR failed to run container process err=Channel(ExecError("error in executing process : executable 'uname' not found in $PATH"))

jeromegn commented 7 months ago

I figured it out. If you're running from an initramfs (e.g. as part of an initrd program) with no "real" filesystem, you need to use mount --move and chroot instead of pivot_root: https://github.com/opencontainers/runc/blob/5e0ec3fbbff07ef4016c04ae584f18b27bbcfdc1/libcontainer/SPEC.md?plain=1#L118-L125
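A rough sketch of the `mount --move` plus `chroot` sequence, following the order described in the linked runc SPEC section. It is not youki's or runc's actual code: `move_root` is a hypothetical name, and shelling out to a `mount` binary on `PATH` is an assumption made to keep the example dependency-free. A real runtime would more likely call mount(2) with `MS_MOVE` directly.

```rust
use std::io;
use std::os::unix::fs::chroot;
use std::path::Path;
use std::process::Command;

/// Switch the root filesystem without pivot_root.
/// `rootfs` must be an absolute path to an existing mount point, and the
/// caller needs CAP_SYS_ADMIN in a private mount namespace.
/// Hypothetical helper for illustration only.
pub fn move_root(rootfs: &Path) -> io::Result<()> {
    // 1. Enter the new root first so that "." still refers to it after the move.
    std::env::set_current_dir(rootfs)?;

    // 2. Equivalent of `mount --move <rootfs> /`.
    //    Assumption: a `mount` binary is available on PATH.
    let status = Command::new("mount")
        .arg("--move")
        .arg(rootfs)
        .arg("/")
        .status()?;
    if !status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("mount --move failed: {}", status),
        ));
    }

    // 3. chroot into the moved mount, then reset the working directory.
    chroot(".")?;
    std::env::set_current_dir("/")
}
```

The `chdir` before the move matters: after the mount is moved onto `/`, the old path no longer names the new root, but the process's working directory still does, so `chroot(".")` lands in the right place.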

I might do a PR that detects this niche case if it's of any interest.
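One way such detection could look, as a heuristic sketch only: read `/proc/self/mountinfo` and check whether the mount at `/` reports the filesystem type `rootfs`, which is what the kernel typically shows when the initramfs is still the root. The function names are hypothetical, the heuristic is an assumption rather than what any youki PR does, and it may misfire on unusual setups.

```rust
use std::fs;
use std::io;

/// Filesystem type of the mount at "/", taken from mountinfo-formatted text.
/// The last matching entry wins because later mounts stack on top of earlier ones.
pub fn root_fs_type(mountinfo: &str) -> Option<&str> {
    mountinfo
        .lines()
        .filter_map(|line| {
            // Format: "<id> <parent> <maj:min> <root> <mount point> <opts> [optional...] - <fstype> <source> <super opts>"
            let (pre, post) = line.split_once(" - ")?;
            let mount_point = pre.split_whitespace().nth(4)?;
            let fs_type = post.split_whitespace().next()?;
            if mount_point == "/" { Some(fs_type) } else { None }
        })
        .last()
}

/// Heuristic: the root is still the initramfs when its type is "rootfs".
pub fn looks_like_initramfs(mountinfo: &str) -> bool {
    root_fs_type(mountinfo) == Some("rootfs")
}

/// Applies the heuristic to the calling process (assumes /proc is mounted).
pub fn running_from_initramfs() -> io::Result<bool> {
    let text = fs::read_to_string("/proc/self/mountinfo")?;
    Ok(looks_like_initramfs(&text))
}
```

A runtime could consult `running_from_initramfs()` before preparing the rootfs and fall back to `mount --move` and `chroot` when it returns `Ok(true)`.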

utam0k commented 5 months ago

Hi, @jeromegn. This issue may have been fixed by https://github.com/containers/youki/pull/2780. Could you confirm that with the latest libcontainer?