nestybox / sysbox

An open-source, next-generation "runc" that empowers rootless containers to run workloads such as Systemd, Docker, Kubernetes, just like VMs.
Apache License 2.0

Cannot install QEMU for multi-arch builds in a Pod #823

Open Rory-Z opened 3 months ago

Rory-Z commented 3 months ago

Hello, I installed Sysbox on AWS EKS 1.29 as described in https://github.com/nestybox/sysbox/issues/820, and I created a pod like this:

apiVersion: v1
kind: Pod
metadata:
  name: dind
  annotations:
    io.kubernetes.cri-o.userns-mode: "auto:size=65536"
spec:
  runtimeClassName: sysbox-runc
  containers:
  - name: dind
    image: docker:dind

Now I can run docker commands in the pod without making the pod privileged. Everything looks good, it's amazing.

But when I try to install https://github.com/tonistiigi/binfmt for docker buildx, I get this error:

$ docker run --rm --privileged docker.io/tonistiigi/binfmt:latest --install all
error: operation not permitted
cannot mount binfmt_misc filesystem at /proc/sys/fs/binfmt_misc
main.run
    /src/cmd/binfmt/main.go:183
main.main
    /src/cmd/binfmt/main.go:170
runtime.main
    /usr/local/go/src/runtime/proc.go:250
runtime.goexit
    /usr/local/go/src/runtime/asm_amd64.s:1571

Any ideas?

ctalledo commented 1 month ago

Hi @Rory-Z, thanks for filing the issue.

Yes, it's currently not possible to mount binfmt_misc inside a Sysbox container:

/proc/sys/fs # mount -t binfmt_misc none /proc/sys/fs/binfmt_misc
mount: permission denied (are you root?)

Adding support for this is tricky, because binfmt_misc is not namespaced, meaning that if a container registers a binfmt handler for a particular binary format, that will affect the host and other containers, which is not good. Ideally, the registration would be specific to that container only, and not affect the host or other containers.

At this point we don't have cycles to support it unfortunately.
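
Because binfmt_misc registrations are global rather than per-container, it can help to check from the host (or from any environment where the filesystem is visible) whether binfmt_misc is mounted and which handlers are already registered. The following is a minimal sketch, not part of Sysbox or tonistiigi/binfmt: the function names `binfmt_mounted` and `list_binfmt_handlers` are hypothetical, and the optional path arguments are an assumption added only so the helpers can be pointed at a different file or directory.

```shell
#!/bin/sh

# Hypothetical helper: succeeds if the given mounts table
# (default: /proc/mounts) lists a filesystem of type binfmt_misc.
binfmt_mounted() {
    mounts_file="${1:-/proc/mounts}"
    # Field 3 of each line in /proc/mounts is the filesystem type.
    awk '$3 == "binfmt_misc" { found = 1 } END { exit !found }' "$mounts_file"
}

# Hypothetical helper: prints the name of every handler registered
# in the given binfmt_misc directory, one per line.
list_binfmt_handlers() {
    dir="${1:-/proc/sys/fs/binfmt_misc}"
    for entry in "$dir"/*; do
        [ -f "$entry" ] || continue
        name=$(basename "$entry")
        # "register" and "status" are control files, not handlers.
        case "$name" in
            register|status) continue ;;
        esac
        echo "$name"
    done
}
```

Running `binfmt_mounted && list_binfmt_handlers` on a node shows the handlers that every container on that node would share, which is the cross-container effect described above. Inside a Sysbox container, where the mount is refused, `binfmt_mounted` would be expected to fail.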

rodnymolina commented 1 month ago

@Rory-Z, this may not help you, and you are probably already aware of this, but building multi-arch images with these emulators is usually much slower than building on native platforms, so I would allocate different k8s nodes for this purpose.

Rory-Z commented 1 month ago

@rodnymolina @ctalledo Thanks for the answers. I'm sorry to hear that there is currently no way to support this. As @rodnymolina suggested, I will try to allocate different k8s nodes for this purpose. I will also keep following this issue, and if there is any update in the future, I will try it right away. Thanks again.