nestybox / sysbox

An open-source, next-generation "runc" that empowers rootless containers to run workloads such as Systemd, Docker, Kubernetes, just like VMs.
Apache License 2.0

sysbox error with docker 20.10.13 #502

Closed: dmarteau closed this issue 2 years ago

dmarteau commented 2 years ago

I have a docker-compose file with a custom network defined:

networks:
  lzmcloud_net:
     ipam:
       driver: default
       config:
         - subnet: 172.200.0.0/16

Services are configured like this:

services:
   master:
      ...
      runtime: sysbox-runc
      networks:
          lzmcloud_net:
              ipv4_address: 172.200.0.2

With Docker 20.10.12 everything worked perfectly. I just upgraded docker-ce, docker-ce-cli, and containerd to:

docker-ce                                              5:20.10.13~3-0~ubuntu-bionic
containerd.io                                          1.5.10-1 
docker-ce-cli                                          5:20.10.13~3-0~ubuntu-bionic

Now starting the containers fails with the following error:

Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:393: starting container process caused: process_linux.go:607: container init caused: write sysctl key net.ipv4.ping_group_range: write /proc/sys/net/ipv4/ping_group_range: invalid argument: unknown
Makefile:56: recipe for target 'up' failed
make: *** [up] Error 1

Note that there is no problem with the default Docker runtime: the container starts as expected with the same custom network config.

I can confirm that after rolling back to docker-ce 20.10.12, everything works perfectly again.

dmarteau commented 2 years ago

Maybe this is related to a change listed in the Docker 20.10.13 release notes: https://docs.docker.com/engine/release-notes/#201013

rodnymolina commented 2 years ago

@dmarteau, thanks for filing this and for the Docker pointer. Looks like we'll need to look into this right away. Will get back to you ASAP.

rodnymolina commented 2 years ago

As suggested by @dmarteau, the problem is a side effect of this recent docker change.

These changes allow ping (ICMP) traffic to be sourced from within regular unprivileged containers, but this is only possible when containers run in userns-mode=host (i.e., without a dedicated user namespace). Docker enforces this by verifying that the userns-remap feature is not enabled: it only writes to ping_group_range if userns-remap is turned off. This approach works fine when Docker operates with the OCI runc, but it breaks with Sysbox, where user namespaces are always used and enforced.
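The decision described above can be sketched roughly as follows. This is a simplified model, not Docker's actual code: `injectedSysctls` and `runtimeUsesUserns` are hypothetical helpers, and the value `0 2147483647` is an assumption about what Docker writes. The point is only that the daemon-level userns-remap check says nothing about whether the runtime itself places the container in a user namespace.

```go
package main

import "fmt"

// pingGroupRange is the value this sketch ASSUMES Docker injects.
// Treat it as an assumption, not a quote from Docker's source.
const pingGroupRange = "0 2147483647"

// injectedSysctls is a hypothetical helper (not a Docker API). It models the
// rule from the comment above: the sysctl is added only when the daemon's
// userns-remap feature is turned off.
func injectedSysctls(usernsRemapEnabled bool) map[string]string {
	sysctls := map[string]string{}
	if !usernsRemapEnabled {
		sysctls["net.ipv4.ping_group_range"] = pingGroupRange
	}
	return sysctls
}

// runtimeUsesUserns is a hypothetical lookup. Per the thread, sysbox-runc
// always runs the container in a user namespace; with other runtimes this
// sketch assumes a user namespace is used only when userns-remap is on.
func runtimeUsesUserns(runtime string, usernsRemapEnabled bool) bool {
	return runtime == "sysbox-runc" || usernsRemapEnabled
}

func main() {
	const usernsRemap = false // daemon started without userns-remap
	for _, rt := range []string{"runc", "sysbox-runc"} {
		_, injected := injectedSysctls(usernsRemap)["net.ipv4.ping_group_range"]
		inUserns := runtimeUsesUserns(rt, usernsRemap)
		// A mismatch means the sysctl is written inside a user namespace,
		// which is the situation reported in this issue.
		fmt.Printf("runtime=%s injected=%t userns=%t mismatch=%t\n",
			rt, injected, inUserns, injected && inUserns)
	}
}
```

In this model the mismatch only appears for sysbox-runc with userns-remap off, which matches the reported behavior.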

Luckily we had indirectly addressed this issue as part of the buildkit-support feature (already merged), so a fix for this one will come in our next official release (v0.5.0) which is expected within the next couple of days.

fhaefemeier commented 2 years ago

These changes allow ping (ICMP) traffic to be sourced within regular unprivileged containers, but this is only possible when containers are running in userns-mode=host (i.e. no dedicated user-namespaces). To do so, Docker first checks if the userns-remap feature is enabled, and only if that's the case does it write into the ping_group_range sysctl. This approach works fine when docker operates with the oci-runc, but it breaks when dealing with Sysbox (where user-namespaces are always utilized/enforced).

@rodnymolina, so this means the issue only happens on a Docker daemon with userns-remap enabled and has no impact without it, right?

rodnymolina commented 2 years ago

@fhaefemeier, no, this issue will happen whenever you launch a sysbox container while having the very latest Docker version installed (i.e., 20.10.13). I re-adjusted my previous comment above to make it clearer.
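For readers wondering why the write is rejected with "invalid argument" inside a Sysbox container: as far as I understand the kernel behavior, a write to ping_group_range fails when either bound is not a valid GID in the writer's user namespace. The sketch below models that check against gid_map contents. The helper names are hypothetical, the map `0 165536 65536` is an illustrative example rather than Sysbox's actual mapping, and the upper bound 2147483647 is an assumption about the value Docker writes.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// gidMapEntry mirrors one line of /proc/<pid>/gid_map:
// first ID inside the namespace, first ID outside, and range length.
type gidMapEntry struct {
	Inside, Outside, Count uint64
}

// parseGIDMap parses gid_map-style text (one "inside outside count" per line).
func parseGIDMap(s string) ([]gidMapEntry, error) {
	var entries []gidMapEntry
	sc := bufio.NewScanner(strings.NewReader(s))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var e gidMapEntry
		if _, err := fmt.Sscan(line, &e.Inside, &e.Outside, &e.Count); err != nil {
			return nil, fmt.Errorf("bad gid_map line %q: %w", line, err)
		}
		entries = append(entries, e)
	}
	return entries, sc.Err()
}

// isMapped reports whether gid (as seen inside the namespace) has a mapping.
func isMapped(entries []gidMapEntry, gid uint64) bool {
	for _, e := range entries {
		if gid >= e.Inside && gid-e.Inside < e.Count {
			return true
		}
	}
	return false
}

// rangeAccepted models the assumed kernel rule: both bounds must be mapped.
func rangeAccepted(entries []gidMapEntry, low, high uint64) bool {
	return isMapped(entries, low) && isMapped(entries, high)
}

func main() {
	maps := map[string]string{
		"no user namespace":      "0 0 4294967295", // initial namespace mapping
		"user namespace (65536)": "0 165536 65536", // illustrative container mapping
	}
	for name, text := range maps {
		entries, err := parseGIDMap(text)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: accepted=%t\n", name, rangeAccepted(entries, 0, 2147483647))
	}
}
```

Under these assumptions, the state of the daemon's userns-remap setting is irrelevant: any container whose user namespace does not map the upper bound would see the same failure.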

Btw, the Sysbox v0.5.0 release was deferred a few extra days due to a last-minute issue that ended up being a false alarm. The new ETA is 03/21.

rodnymolina commented 2 years ago

The fix went into the Sysbox v0.5.0 release. Please re-open if you have any issues.