nestybox / sysbox

An open-source, next-generation "runc" that empowers rootless containers to run workloads such as Systemd, Docker, Kubernetes, just like VMs.
Apache License 2.0

Docker buildx workers fail when run inside a sysbox container #384

Closed lox closed 2 years ago

lox commented 3 years ago

I was hoping to use the cache-from and cache-to directives to persist the Docker layer cache between builds in my CI setup, but ran into an error.
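
For context, a minimal sketch of the kind of invocation this refers to, using buildx's local cache exporter. The helper name buildx_cache_flags, the cache directory, and the image tag are placeholders chosen for this example, not taken from the original report.

```shell
# Hypothetical helper: print buildx flags that read from and write to
# a local layer-cache directory passed as $1.
buildx_cache_flags() {
  cache_dir="$1"
  printf '%s %s\n' \
    "--cache-from=type=local,src=${cache_dir}" \
    "--cache-to=type=local,dest=${cache_dir},mode=max"
}

# Example invocation (placeholder tag and path; needs a builder that
# uses the docker-container driver, which is what fails in this issue):
#   docker buildx build $(buildx_cache_flags /tmp/buildx-cache) -t myapp:ci .
buildx_cache_flags /tmp/buildx-cache
```

Exporting the cache this way depends on a builder created with docker buildx create, which is why the failure below blocks the whole setup.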

To reproduce:

docker run --runtime=sysbox-runc -it --rm --name test-1 --hostname test-1 ghcr.io/nestybox/ubuntu-focal-systemd-docker:latest

docker buildx create --name mybuilder --use
docker buildx inspect mybuilder --bootstrap

...

error: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: write sysctl key net.ipv4.ping_group_range: write /proc/sys/net/ipv4/ping_group_range: invalid argument: unknown

I suspect this relates to the builder setting a network mode of host.

lox commented 3 years ago

Was hoping this would fix it, but it doesn't:

docker buildx create --name mybuilder --driver docker-container --driver-opt network=bridge --use

rodnymolina commented 3 years ago

This is the problem that we need to address: /proc/sys/net/ipv4/ping_group_range should be writable within a sys container. Should have that done within the next few days.

admin@test-1:~$ docker buildx inspect mybuilder --bootstrap
[+] Building 7.4s (1/1) FINISHED
 => ERROR [internal] booting buildkit                                                                                                                                                                  7.4s
 => => pulling image moby/buildkit:buildx-stable-1                                                                                                                                                     6.0s
 => => creating container buildx_buildkit_mybuilder0                                                                                                                                                   1.4s
------
 > [internal] booting buildkit:
------
Name:   mybuilder
Driver: docker-container

Nodes:
Name:     mybuilder0
Endpoint: unix:///var/run/docker.sock
Error:    Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: write sysctl key net.ipv4.ping_group_range: write /proc/sys/net/ipv4/ping_group_range: invalid argument: unknown
admin@test-1:~$

admin@test-1:~$ sudo ls -lrt /proc/sys/net/ipv4/ping_group_range
[sudo] password for admin:
-rw-r--r-- 1 root root 0 Oct 16 19:14 /proc/sys/net/ipv4/ping_group_range
admin@test-1:~$
admin@test-1:~$ sudo cat /proc/sys/net/ipv4/ping_group_range
65534   65534
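
One possible explanation for the "invalid argument" error is that the range being written includes a GID with no mapping in the container's user namespace, which the kernel rejects. The following is a minimal sketch for checking that from inside the container. It assumes the value being written is the wide range 0 2147483647 (an assumption; the error message does not show the value), and the helper name gid_is_mapped is hypothetical.

```shell
# Hypothetical helper: succeed if GID $1 falls inside a range listed in
# the gid_map file $2 (defaults to the current process's map).
gid_is_mapped() {
  gid="$1"
  map_file="${2:-/proc/self/gid_map}"
  # Each gid_map line reads: <first ID inside ns> <first ID outside ns> <count>
  while read -r inside _outside count; do
    [ -n "$inside" ] || continue
    if [ "$gid" -ge "$inside" ] && [ "$gid" -lt $((inside + count)) ]; then
      return 0
    fi
  done < "$map_file"
  return 1
}

# Assumed upper bound of the range being written (not shown in the error).
upper=2147483647
if [ -r /proc/self/gid_map ]; then
  if gid_is_mapped "$upper"; then
    echo "GID $upper is mapped in this user namespace"
  else
    echo "GID $upper is not mapped; writing it to ping_group_range would likely fail"
  fi
fi
```

If the upper bound turns out to be unmapped, that would be consistent with the 65534 values shown above, since 65534 is the overflow GID the kernel reports for IDs without a mapping.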

lox commented 2 years ago

Looks like they worked around this in https://github.com/docker/buildx/pull/887, will test.

rodnymolina commented 2 years ago

@lox, thanks for letting us know. I'll need to look at that fix in more detail but I suspect that it won't be applicable to Sysbox.

Btw, we already have an internal fix for this (as well as other related issues) to enable buildkit within Sysbox containers. We are currently testing most of the buildkit features (there are quite a few) and expect to release all of this within a few weeks.

href commented 2 years ago

Is there any workaround, or can we get the internal fix? I would love to use docker buildx, which seems to currently not be possible.

rodnymolina commented 2 years ago

@href, we should be merging all buildx-related changes within the next couple of days. Ping me on Slack if you want beta access to this image.

href commented 2 years ago

Thanks, next couple of days works for me. Looking forward to it!

rodnymolina commented 2 years ago

@href, @lox, please pull the latest changes (there are a bunch) and give this one another try. Let me know if you run into any problems.

Thanks.

href commented 2 years ago

I built my own release using 3513b747f52ecaf02a4e1b628795eb66d7330b6a and it seems like the error is still present. Hopefully I'm just doing something wrong.

I used the build instructions found here, copied the binaries to my sysbox host, replaced the binaries in /usr/bin/sysbox and restarted sysbox.service.

For good measure I also restarted my GitLab runner and Docker services.

After failing to get this to work in my runners, I tried the example at the top of this issue. The result is the same everywhere:

admin@test-1:~$ docker buildx create --name mybuilder --use
mybuilder
admin@test-1:~$ docker buildx inspect mybuilder --bootstrap
[+] Building 6.9s (1/1) FINISHED
 => ERROR [internal] booting buildkit                                                                                                                                                                                    6.9s
 => => pulling image moby/buildkit:buildx-stable-1                                                                                                                                                                       6.3s
 => => creating container buildx_buildkit_mybuilder0                                                                                                                                                                     0.6s
------
 > [internal] booting buildkit:
------
Name:   mybuilder
Driver: docker-container

Nodes:
Name:     mybuilder0
Endpoint: unix:///var/run/docker.sock
Error:    Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: write sysctl key net.ipv4.ping_group_range: write /proc/sys/net/ipv4/ping_group_range: invalid argument: unknown

I'm pretty sure my version is correct, as you can see in the journal output of my sysbox service:

Feb 02 21:18:14 lab-shared-ci-runner1-rma1 systemd[1]: Started Sysbox container runtime.
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2084996]: sysbox-runc
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2084996]:         edition:         Community Edition (CE)
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2084996]:         version:         0.4.1
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2084996]:         commit:         a8c3e99f2d20f0766b2eddf80ff565d7d6edc03f
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2084996]:         built at:         Tue Feb  1 12:15:24 UTC 2022
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2084996]:         built by:
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2084996]:         oci-specs:         1.0.2-dev
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2085002]: sysbox-mgr
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2085002]:         edition:         Community Edition (CE)
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2085002]:         version:         0.4.1
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2085002]:         commit:         d1f8dfc060fbd8f832c33ac10d43c87aa718efdb
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2085002]:         built at:         Tue Feb  1 12:16:05 UTC 2022
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2085002]:         built by:
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2085008]: sysbox-fs
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2085008]:         edition:         Community Edition (CE)
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2085008]:         version:         0.4.1
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2085008]:         commit:         0e5acbf5dad57d621efb401019bc8895ac540d0f
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2085008]:         built at:         Tue Feb  1 12:15:52 UTC 2022
Feb 02 21:18:14 lab-shared-ci-runner1-rma1 sh[2085008]:         built by:

/proc/sys/net/ipv4/ping_group_range seems to be unchanged:

root@f2d627e16883:/# ls -lrt /proc/sys/net/ipv4/ping_group_range
-rw-r--r-- 1 root root 0 Feb  2 20:33 /proc/sys/net/ipv4/ping_group_range
root@f2d627e16883:/# cat /proc/sys/net/ipv4/ping_group_range
65534   65534

rodnymolina commented 2 years ago

You did everything right, @href. It's my fault: I failed to merge two commits with relevant changes into our public sysbox-fs repo. Unfortunately, our CI didn't catch this either because, this being a brand-new feature, we haven't yet updated our CI jobs to run the new buildx-specific test cases ...

Thanks for letting me know. I'll have this fixed in a couple of hours.

rodnymolina commented 2 years ago

@href, changes are merged now.

Another thing: please keep in mind that Sysbox will need to be configured as the default runtime if you are attempting to launch a buildkit container from your host system (buildkit would run at level-1 in this case). Unfortunately, neither the buildx nor the buildctl CLI offers a runtime flag, so you'll need to use the following command to set this up. Otherwise the buildkit runner will be created with the regular (OCI) runc.

rmolina@dev-vm1:~/sysbox$ sudo ./scr/docker-cfg --default-runtime=sysbox-runc

Alternatively, if you are trying to launch buildkit within a Sysbox container (buildkit running at level-2), then no extra configuration is required.
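
For the level-1 case, a quick way to confirm the setting took effect is to look at Docker's daemon configuration. This is a minimal sketch assuming the configuration lives at the usual /etc/docker/daemon.json and stores the setting under the "default-runtime" key; the helper name default_runtime_in is hypothetical, and the text match is deliberately naive.

```shell
# Hypothetical helper: print the "default-runtime" value from a Docker
# daemon config file, if present. Naive text match that assumes the key
# and its value sit on the same line; use a JSON parser for real tooling.
default_runtime_in() {
  config="${1:-/etc/docker/daemon.json}"
  [ -r "$config" ] || return 1
  sed -n 's/.*"default-runtime"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$config" | head -n 1
}

runtime="$(default_runtime_in || true)"
echo "daemon.json default-runtime: ${runtime:-<not set>}"

# With a running daemon, the effective value can also be read directly:
#   docker info --format '{{.DefaultRuntime}}'
```

If this does not report sysbox-runc on the host, a buildkit container started by buildx would be expected to run under the regular runc, as described above.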

href commented 2 years ago

Thanks, I retried it with the latest build and it worked as expected. Thank you! Looking forward to a release 🙂

rodnymolina commented 2 years ago

@lox, I'll go ahead and close this one now. Please re-open it if you have any questions.