nestybox / sysbox

An open-source, next-generation "runc" that empowers rootless containers to run workloads such as Systemd, Docker, Kubernetes, just like VMs.
Apache License 2.0
2.73k stars 151 forks

/dev/kvm not accessible inside docker container when running with sysbox #717

Open matejdro opened 1 year ago

matejdro commented 1 year ago

We need access to kvm in our docker machines to open android emulator instances.

We pass KVM to the docker container via the --device=/dev/kvm flag. This works fine without sysbox, but when --runtime=sysbox-runc is also present at container start, the Android Emulator does not start and instead complains: "This user doesn't have permissions to use KVM (/dev/kvm)." The user inside the container is root.
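For reference, the two invocations being compared can be sketched as below. This is only an illustration: the image name `ci-image`, the helper `docker_run_args`, and the `ls -l /dev/kvm` probe command are hypothetical; only the `--device=/dev/kvm` and `--runtime=sysbox-runc` flags come from the report.

```python
import shlex


def docker_run_args(image, use_sysbox=False):
    """Build a `docker run` argv that passes /dev/kvm into the container.

    `image` is a placeholder image name; the probe command simply lists
    the device so its ownership inside the container can be inspected.
    """
    args = ["docker", "run", "--rm", "--device=/dev/kvm"]
    if use_sysbox:
        # The flag that, per the report, makes KVM access fail.
        args.append("--runtime=sysbox-runc")
    args += [image, "ls", "-l", "/dev/kvm"]
    return args


if __name__ == "__main__":
    # Print both command lines so they can be run and compared by hand.
    for use_sysbox in (False, True):
        print(shlex.join(docker_run_args("ci-image", use_sysbox)))
```

Running the two printed commands side by side should show how the ownership of /dev/kvm differs between the default runtime and sysbox-runc.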

Any idea what could cause this and how could we get it to work?

ctalledo commented 1 year ago

Hi @matejdro, thanks for giving Sysbox a try.

Unfortunately, accessing /dev/kvm inside the Sysbox container is not yet supported (and we don't have short-term plans to do this).

The problem you are likely facing is that Sysbox containers use a "fake root" (i.e., Linux user-namespace), so root in the sysbox container is not the host's real root and therefore has no permissions to access the /dev/kvm device (which shows up with nobody:nogroup ownership inside the container).
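One way to check whether this is what is happening is to look, from inside the container, at the user-namespace ID mapping and at the ownership of the device node. The script below is a rough diagnostic sketch, not part of Sysbox; the function names are made up for illustration, and it assumes a Linux system where /proc/self/uid_map is readable. An unmapped owner normally shows up as the overflow ID 65534 (nobody).

```python
import os
import stat


def parse_id_map(text):
    """Parse uid_map/gid_map content into (inside_start, outside_start, length) tuples."""
    return [tuple(int(field) for field in line.split())
            for line in text.splitlines() if line.strip()]


def root_maps_to_parent_root(entries):
    """True if ID 0 in this namespace maps to ID 0 in the parent user namespace."""
    return any(inside == 0 and outside == 0 for inside, outside, _ in entries)


def describe_device(path="/dev/kvm"):
    """Return a one-line summary of the device's owner, mode, and accessibility."""
    try:
        st = os.stat(path)
    except OSError as exc:
        return f"{path}: {exc.strerror}"
    can_rw = os.access(path, os.R_OK | os.W_OK)
    return (f"{path}: uid={st.st_uid} gid={st.st_gid} "
            f"mode={stat.filemode(st.st_mode)} read_write={can_rw}")


if __name__ == "__main__":
    try:
        with open("/proc/self/uid_map") as f:
            entries = parse_id_map(f.read())
        print("uid_map:", entries)
        print("root maps to parent-namespace root:", root_maps_to_parent_root(entries))
    except OSError as exc:
        print("could not read /proc/self/uid_map:", exc.strerror)
    print(describe_device())
```

If root does not map to the parent namespace's root and the device reports uid=65534, the "fake root" explanation above is the likely cause.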

To overcome this, Sysbox would need to make that device show up with proper permissions inside the container, by leveraging shiftfs or ID-mapped mounts, or by emulating the /dev/kvm device. Unfortunately, shiftfs is being deprecated, ID-mapped mounts likely won't work on /dev/, and having Sysbox emulate /dev/kvm would work but may carry a performance hit (not sure how much).

Out of curiosity, what is the use case you are after? (if you can share).

Thanks!

matejdro commented 1 year ago

Our use case is having a single docker image for CI that can handle both Android instrumented tests (for which we need access to /dev/kvm) and running docker commands inside (docker-in-docker) for backend tests and building.

SkyperTHC commented 1 year ago

Special consideration is needed so that the container can't get around its cgroup restrictions (CPU accounting etc.) if the host's /dev/kvm is directly exposed to the container, no?

ctalledo commented 12 months ago

> Special consideration is needed so that the container can't get around its cgroup restrictions (CPU accounting etc.) if the host's /dev/kvm is directly exposed to the container, no?

Likely yes; whenever you expose hardware directly to the container, the problem of how to ensure containers are bounded on their use of that resource comes up and it can be challenging to solve unless the kernel has the constructs that would allow that resource to be limited.

jukuli commented 1 week ago

Any known workarounds for this yet? I would like to use sysbox for all docker and dind cases, but currently have some Android test cases that need /dev/kvm.