nestybox / sysbox

An open-source, next-generation "runc" that empowers rootless containers to run workloads such as Systemd, Docker, Kubernetes, just like VMs.
Apache License 2.0

Sysbox unable to handle read-only rootfs requirement #151

Closed rodnymolina closed 3 years ago

rodnymolina commented 3 years ago

The problem was initially observed in a Cloudron setup. For security purposes, Cloudron sets the container's spec so that its rootfs is mounted read-only.

There are two different issues being reported here:

1) Sysbox expects a 'read-write' rootfs in the OCI spec it receives. We already have a fix for this one, but it hasn't been merged yet. Until then, this expectation prevents read-only containers from launching properly:

$ docker run --runtime=sysbox-runc -it --rm --read-only --name test-1 nestybox/ubuntu-focal-systemd-docker
docker: Error response from daemon: OCI runtime create failed: error in the container spec: invalid or unsupported container spec: root path must be read-write but it's set to read-only: unknown.

2) In scenarios involving docker's custom-networks (which is the case in Cloudron), Sysbox adjusts the DNS resolution process to avoid forwarding issues in nested container setups. As part of this adjustment, sysbox-runc writes into the container's /etc/resolv.conf file. However, this is not allowed in read-only setups. This triggers the following failure:

$ docker run --runtime=sysbox-runc -it --rm --read-only --name test-2 --net=rodny-testing ubuntu
docker: Error response from daemon: OCI runtime create failed: container_linux.go:364: starting container process caused "process_linux.go:533: container init caused \"switching Docker DNS: rootfs_linux.go:1247: writing /etc/resolv.conf caused \\\"open /etc/resolv.conf: read-only file system\\\"\"": unknown.
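For readers unfamiliar with issue 1: `--read-only` reaches the runtime as `root.readonly: true` in the OCI runtime spec (the bundle's `config.json`). The snippet below is only a rough Python sketch of the kind of validation that produces the first error. It is not Sysbox's actual code (which is written in Go), and `validate_root` and `allow_readonly` are hypothetical names used for illustration.

```python
import json


def validate_root(spec: dict, allow_readonly: bool = False) -> None:
    """Reject a spec whose rootfs is read-only unless that is explicitly allowed.

    Hypothetical helper: it mirrors the error message in the issue, not the
    real sysbox-runc implementation.
    """
    root = spec.get("root") or {}
    # Per the OCI runtime spec, `root.readonly` is optional and defaults to false.
    if root.get("readonly", False) and not allow_readonly:
        raise ValueError("root path must be read-write but it's set to read-only")


# Trimmed-down stand-in for the config.json generated for `docker run --read-only`.
spec = json.loads('{"ociVersion": "1.0.2", "root": {"path": "rootfs", "readonly": true}}')

try:
    validate_root(spec)
    rejected = False
except ValueError:
    rejected = True
```

Under these assumptions, a fix for issue 1 amounts to no longer rejecting such a spec, which the sketch models with `allow_readonly=True`.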

/cc @vrobm

gramakri commented 3 years ago

(co-founder of Cloudron). Let me know if you need any clarifications on the Cloudron setup. Do you know of any workaround for the DNS resolution? We have an unbound server on the host and we require the containers to resolve via that DNS server. This is required for email server DNSBL lookups to work reliably across all VPS providers.

Thanks for looking into this!

rodnymolina commented 3 years ago

Thanks @gramakri, @vrobm has been very helpful answering my questions and providing the setup. I'm expecting to have a fix for this issue very soon.

Now, coming to the DNS issue that you alluded to, is this something that you have replicated in the setup that I've been using (with my temporary fixes), or is this a separate setup? If this is a different environment, could you please file a separate issue for it with a brief explanation?

gramakri commented 3 years ago

@rodnymolina Ah sorry, I was merely commenting on your point no. 2 in the initial bug report. I was just letting you know why our DNS setup is the way it is, nothing more :)

rodnymolina commented 3 years ago

@gramakri, I see your point.

The error that Sysbox displays above (as part of problem 2) is a direct consequence of some DNS tweaking we did a while back to avoid resolution issues in nested containers. Basically, in scenarios where Docker's internal DNS is used (i.e. whenever Docker custom networks are used), Docker pushes a "127.0.0.11" entry into the container's resolv.conf and creates a SNAT iptables entry to ensure that DNS queries are forwarded to Docker's embedded DNS running on the host. That works well for containers running at level 1 (L1), but it causes issues for nested containers (L2+), as these rely on Docker's default DNS server (i.e. 8.8.8.8). As a result, DNS queries initiated from L2 containers end up going to an external DNS server and not to the one running on the host.

We fixed this problem by having Sysbox write a valid/routable address into resolv.conf (we picked the address of the L1 container's egress interface), and by adjusting the iptables rules to ensure that all DNS queries are forwarded to Docker's embedded DNS server.
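To make the resolv.conf half of that adjustment concrete, here is a minimal Python sketch of the rewrite step, assuming the only change is swapping the "127.0.0.11" nameserver entry for a routable address. The function name `switch_docker_dns` and the address 172.18.0.2 are hypothetical; the real logic lives in sysbox-runc (Go), and the accompanying iptables changes are not shown.

```python
import ipaddress

DOCKER_EMBEDDED_DNS = "127.0.0.11"


def switch_docker_dns(resolv_conf: str, new_addr: str) -> str:
    """Return resolv.conf text with Docker's embedded-DNS entry replaced by new_addr."""
    ipaddress.ip_address(new_addr)  # raises ValueError on a malformed address
    out = []
    for line in resolv_conf.splitlines():
        fields = line.split()
        is_docker_dns = (
            len(fields) >= 2
            and fields[0] == "nameserver"
            and fields[1] == DOCKER_EMBEDDED_DNS
        )
        # Only the embedded-DNS nameserver line changes; everything else is kept.
        out.append(f"nameserver {new_addr}" if is_docker_dns else line)
    return "".join(line + "\n" for line in out)


# 172.18.0.2 stands in for the L1 container's egress interface address (assumed value).
original = "nameserver 127.0.0.11\noptions ndots:0\n"
updated = switch_docker_dns(original, "172.18.0.2")
```

Writing the result back to /etc/resolv.conf inside the container is the step that fails when the rootfs is mounted read-only, which is exactly the error reported in problem 2.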

I'm explaining all this so that you understand that, from my perspective, your DNS setup is nothing special: you are just relying on Docker's embedded DNS server, and the problem is triggered simply because we can't write into the container's resolv.conf, as it's mounted read-only (the fix is WIP).

Please let me know if there's anything that I haven't fully understood about your DNS setup.

rodnymolina commented 3 years ago

Fixed by sysbox-runc PR #22. Closing now.