kata-containers / runtime

Kata Containers version 1.x runtime (for version 2.x see https://github.com/kata-containers/kata-containers).
https://katacontainers.io/
Apache License 2.0

POD creation fails with read-only file system #3021

Closed nimajafroodi closed 3 years ago

nimajafroodi commented 4 years ago

Description of problem

Creating pods using CRI-O and kata-runtime fails because the kata-agent cannot create directories. The failure is caused by a read-only file system whose origin is unknown.

Expected result

Pod and container creation succeeds.

Actual result

FATA[0032] run pod sandbox failed: rpc error: code = Unknown desc = container create failed: rpc error: code = Internal desc = Could not run process: container_linux.go:345: starting container process caused "process_linux.go:424: container init caused \"rootfs_linux.go:58: mounting \\"proc\\" to rootfs \\"/run/kata-containers/shared/containers/7ffea0a7a06511fb2591b3370f9d4e6821d52dfcad2105ca001b5b44c7b1fedb/rootfs\\" at \\"/proc\\" caused \\"mkdir /run/kata-containers/shared/containers/7ffea0a7a06511fb2591b3370f9d4e6821d52dfcad2105ca001b5b44c7b1fedb/rootfs/proc: read-only file system\\"\""

Further information

Hi. I am trying to use kata-runtime as the OCI-compatible runtime for managing pods and containers. My approach is to run both CRI-O and kata-runtime inside a rootless environment (similar to a Docker image that bundles CRI-O and kata-runtime and is used as a nested Docker setup). In my test environment I can manage images using CRI-O, but pod creation fails.

What I observe is that CRI-O successfully pulls the pause image and invokes kata-runtime, which creates the root file system for the pod on the host and spawns the QEMU process. Once the QEMU VM is up and running, I can see that the kata-agent starts but terminates immediately because it cannot bind mount the rootfs from the host using the 9pfs storage driver. It seems that the VM's root file system is mounted read-only by design, so I am confused about why the kata-agent tries to create any directories at all.

Sorry if I am making a wrong assumption here; I still don't fully understand the internals of the agent. This is probably not a bug. I opened this ticket purely to get help with debugging this issue and to better understand what is happening and what the underlying problem could be.

debug_logs: pod_error.log

nimajafroodi commented 4 years ago

A gentle reminder for this issue

devimc commented 4 years ago

Did you try adding rw to the kernel cmdline?

https://github.com/kata-containers/runtime/blob/master/cli/config/configuration-qemu.toml.in#L30

Configuration files: /etc/kata-containers/configuration.toml or /usr/share/defaults/kata-containers/configuration.toml
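
To check what the kernel command line setting currently looks like, a small helper along these lines could be used. This is only a sketch: the function name `kata_kernel_params` is made up for illustration, and it assumes that the first configuration file that exists is the one in effect and that the setting lives on a line starting with `kernel_params`.

```shell
# Hypothetical helper: print the first existing config file from the
# argument list, followed by its kernel_params line(s).
kata_kernel_params() {
    for f in "$@"; do
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            # Match lines such as: kernel_params = "rw"
            grep -E '^[[:space:]]*kernel_params[[:space:]]*=' "$f"
            return
        fi
    done
    echo "no configuration file found" >&2
    return 1
}
```

Call it with the candidate configuration paths as arguments, in order of precedence. If the printed line does not contain `rw`, the parameter has not been added to that file.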

nimajafroodi commented 4 years ago

Yes, and that didn't resolve the problem. What is strange is that I don't get this error when using CRI-O + kata-runtime directly on the host. The error happens only when running CRI-O and kata-runtime inside another containerized environment such as Docker. Do you happen to have any documentation on how to run kata-runtime inside a docker environment that is already spawned with runc? Is there any technical issue of doing so?

devimc commented 4 years ago

Do you happen to have any documentation on how to run kata-runtime inside a docker environment that is already spawned with runc? Is there any technical issue of doing so?

AFAIK, no, there is no such document. Are you running the runC container with --privileged? I think you need it. Take a look at these two issues; you might find something useful: https://github.com/kata-containers/runtime/issues/2618 https://github.com/kata-containers/runtime/issues/358

nimajafroodi commented 4 years ago

AFAIK, no, there is no such document. Are you running the runC container with --privileged? I think you need it.

Yes.

nimajafroodi commented 4 years ago

Thanks for the posted links. I will get back to you after reading those posts.

nimajafroodi commented 3 years ago

This issue can be closed.

We have found the root cause. The issue was related to the mount propagation of the shared directory, which by default is set to the propagation of its mount peer group. In our case the propagation was set to private, which prevented the bind mount of the pod rootfs from being visible to the guest.
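
For anyone hitting the same symptom, the propagation type of a mount can be read from `/proc/self/mountinfo`. The following is a minimal sketch, not the procedure used in this thread: the function name `mount_propagation` is made up for illustration, and it assumes the path passed in is itself a mount point whose path contains no spaces.

```shell
# Hypothetical helper: print the propagation type of a mount point.
# Usage: mount_propagation MOUNTPOINT [MOUNTINFO_FILE]
mount_propagation() {
    target=$1
    file=${2:-/proc/self/mountinfo}
    awk -v target="$target" '
        function add(p, s) { return (p == "") ? s : p "," s }
        $5 == target {
            # Optional fields start at column 7 and end at the "-" separator.
            prop = ""
            for (i = 7; i <= NF && $i != "-"; i++) {
                if ($i ~ /^shared:/)         prop = add(prop, "shared")
                else if ($i ~ /^master:/)    prop = add(prop, "slave")
                else if ($i == "unbindable") prop = add(prop, "unbindable")
            }
            # No propagation tag at all means the mount is private.
            result = (prop == "") ? "private" : prop
            found = 1
        }
        # If several entries share the mount point, the last one wins.
        END { if (found) print result; else exit 1 }
    ' "$file"
}

# Possible remedy (an assumption, the thread does not say which fix was applied),
# run as root inside the outer container, with MOUNTPOINT as a placeholder:
#   mount --make-rshared MOUNTPOINT
```

If the function prints `private` for the mount that holds the shared directory, bind mounts created under it are not propagated to other mount namespaces, which matches the behaviour described above. `findmnt -o TARGET,PROPAGATION` reports the same information without custom parsing.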

devimc commented 3 years ago

@nimajafroodi thanks, closing the issue.