Closed ag-TJNII closed 6 months ago
Hmm, those apt-get update errors sound suspiciously like seccomp failures -- any way you could get an aggressively newer version of libseccomp2 on your host and try again (or try --security-opt seccomp=unconfined on your debian container)?
(I'm not sure how libseccomp2 versions interact via Docker-in-Docker -- I knew at one point but the knowledge has left me. :sob:)
docker run --rm -ti --security-opt seccomp=unconfined debian:latest behaves the same way. I'll have to set up a test bed to test upgrading host libraries; the host I'm reproducing this on is an active node, so I can't fiddle too much there. I can put some cycles into that next week.
The version of libseccomp on the troubled host:
libseccomp.x86_64 2.3.1-4.el7 @centos7-x86_64-os
libseccomp.i686 2.3.1-4.el7 centos7-x86_64-os
libseccomp-devel.i686 2.3.1-4.el7 centos7-x86_64-os
libseccomp-devel.x86_64 2.3.1-4.el7 centos7-x86_64-os
Yeah, thanks for testing -- it's probably not libseccomp then :smile:
My best guess now is that the CentOS 7 kernel supports nf_tables, but maybe it wasn't fully/completely backported to that kernel and thus doesn't work in a network namespace or something?
Also, to be clear, it's not just DNS that is failing. I'm also seeing ICMP and TCP failures.
docker run --rm -ti jonlabelle/network-tools
[network-tools]$ curl http://142.250.191.238 -ILX GET --connect-timeout 5
curl: (28) Failed to connect to 142.250.191.238 port 80 after 5001 ms: Timeout was reached
[network-tools]$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
^C
--- 8.8.8.8 ping statistics ---
9 packets transmitted, 0 received, 100% packet loss, time 7999ms
We are also experiencing this problem, which makes all our CI build jobs fail. Changing all tags from latest to the previous version on hundreds of repos is not ideal :(
Can you run the following one-liner on affected infrastructure and provide the full output?
docker run -it --rm --privileged docker:dind sh -euxc 'modprobe nf_tables > /dev/null 2>&1 || :; if ! iptables -nL > /dev/null 2>&1; then modprobe ip_tables || :; /usr/local/sbin/.iptables-legacy/iptables -nL > /dev/null 2>&1; echo success legacy; else echo success nftables; fi'
It should look something like this:
$ docker run -it --rm --privileged docker:dind sh -euxc 'modprobe nf_tables > /dev/null 2>&1 || :; if ! iptables -nL > /dev/null 2>&1; then modprobe ip_tables || :; /usr/local/sbin/.iptables-legacy/iptables -nL > /dev/null 2>&1; echo success legacy; else echo success nftables; fi'
+ modprobe nf_tables
+ :
+ iptables -nL
+ echo success nftables
success nftables
or:
$ docker run -it --rm --privileged docker:dind sh -euxc 'modprobe nf_tables > /dev/null 2>&1 || :; if ! false iptables -nL > /dev/null 2>&1; then modprobe ip_tables || :; /usr/local/sbin/.iptables-legacy/iptables -nL > /dev/null 2>&1; echo success legacy; else echo success nftables; fi'
+ modprobe nf_tables
+ :
+ iptables -nL
+ modprobe ip_tables
ip: can't find device 'ip_tables'
ip_tables 36864 0
x_tables 53248 8 ip_tables,xt_mark,xt_nat,xt_tcpudp,xt_conntrack,xt_MASQUERADE,xt_addrtype,nft_compat
modprobe: can't change directory to '/lib/modules': No such file or directory
+ :
+ /usr/local/sbin/.iptables-legacy/iptables -nL
+ echo success legacy
success legacy
# docker images | grep 'docker[[:space:]]\+dind[[:space:]]\+'
docker dind 6091c7bd89fd 3 days ago 331MB
# docker run -it --rm --privileged docker:dind sh -euxc 'modprobe nf_tables > /dev/null 2>&1 || :; if ! iptables -nL > /dev/null 2>&1; then modprobe ip_tables || :; /usr/local/sbin/.iptables-legacy/iptables -nL > /dev/null 2>&1; echo success legacy; else echo success nftables; fi'
+ modprobe nf_tables
+ :
+ iptables -nL
+ echo success nftables
success nftables
Any chance you could test https://github.com/docker-library/docker/pull/468? :eyes:
docker build --pull 'https://github.com/docker-library/docker.git#refs/pull/468/merge:24/dind'
This did not resolve it, unfortunately. Still no network inside the inner containers.
That's absolutely flabbergasting. :sob:
I spun up my own CentOS 7 instance to try and debug further, and I managed to replicate immediately. What I've found is that the host is definitely still using the legacy iptables/xtables, and we have zero means of detecting that reliably inside the container that I've found so far. So, as far as I can tell, there's something deficient in either the network namespaces or nf_tables implementations in that CentOS kernel.
The best I've come up with is checking whether ip_tables is loaded and that nf_tables is not (which means if you've run the current version of the container that's loading nf_tables, it'll be unable to detect correctly until you reboot or unload that module). This is pretty fragile, but it's really the best I can think of.
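The heuristic described above (ip_tables loaded, nf_tables not loaded) could be sketched roughly as follows. This is only an illustration, not the code from #468: the function name host_uses_legacy_iptables is made up here, and it assumes loaded modules are listed one per line in /proc/modules with the module name as the first field.

```shell
#!/bin/sh
# Hypothetical helper (not from the docker:dind image): guess whether the
# host is on legacy iptables by looking at which kernel modules are loaded.
# The modules file is a parameter so the logic can be exercised on a fixture.
host_uses_legacy_iptables() {
	modules_file="${1:-/proc/modules}"
	# ip_tables must be loaded...
	grep -q '^ip_tables ' "$modules_file" 2>/dev/null || return 1
	# ...and nf_tables must NOT be loaded.
	if grep -q '^nf_tables ' "$modules_file" 2>/dev/null; then
		return 1
	fi
	return 0
}

if host_uses_legacy_iptables; then
	echo "legacy"
else
	echo "nftables (or undetermined)"
fi
```

As noted above, this gives the wrong answer once anything has loaded nf_tables on the host, so it is fragile by construction.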
Is this something we could allow the user to specify via a config ENV var? I also wonder how much of a concern this needs to be, as CentOS 7 goes EOL at the end of June. If this is a pain for maintenance I think a documented config setting for near end of life / past end of life setups is reasonable.
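For illustration, an opt-in along these lines might look like the sketch below. The variable name USE_LEGACY_IPTABLES and the function select_iptables_path are hypothetical, invented for this example; only the /usr/local/sbin/.iptables-legacy directory comes from this thread.

```shell
#!/bin/sh
# Sketch of an env-var opt-in for the legacy iptables binaries.
# USE_LEGACY_IPTABLES is a HYPOTHETICAL name, not an existing image option.
legacy_dir='/usr/local/sbin/.iptables-legacy'

select_iptables_path() {
	# $1: value of the opt-in variable, $2: current PATH
	case "$1" in
		1|true|yes) printf '%s:%s\n' "$legacy_dir" "$2" ;;
		*) printf '%s\n' "$2" ;;
	esac
}

# Prepend the legacy directory only when the user asked for it.
PATH="$(select_iptables_path "${USE_LEGACY_IPTABLES:-}" "$PATH")"
export PATH
```

With something like this in the entrypoint, a user on an old kernel would pass --env USE_LEGACY_IPTABLES=1 and skip auto-detection entirely.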
Ok, #468 is probably on hold until the new year (https://github.com/docker-library/docker/pull/468#issuecomment-1863363312), but here are some workarounds if you need to fix this before we can resolve that:
FROM docker:dind
ENV PATH /usr/local/sbin/.iptables-legacy:$PATH
or:
docker run ... --env PATH='/usr/local/sbin/.iptables-legacy:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin' ...
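To check that one of the workarounds above took effect, one option is to look at the backend that the iptables binary on PATH reports. The sketch below assumes iptables 1.8 or newer, where iptables --version ends in "(legacy)" or "(nf_tables)"; the function name classify_iptables_version is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: classify an `iptables --version` string by backend.
# Assumes the iptables >= 1.8 format, e.g. "iptables v1.8.10 (legacy)".
classify_iptables_version() {
	case "$1" in
		*'(legacy)'*) echo legacy ;;
		*'(nf_tables)'*) echo nftables ;;
		*) echo unknown ;;
	esac
}

# Only query iptables if it is actually installed in this environment.
if command -v iptables > /dev/null 2>&1; then
	classify_iptables_version "$(iptables --version 2> /dev/null)"
fi
```

Run inside the dind container, this should report legacy once /usr/local/sbin/.iptables-legacy is first on PATH.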
We're seeing issues where dind images after docker:24.0.6-dind have issues with networking inside the inner containers. The following logs are using the docker@sha256:8f9c4d8cdaa2f87b5269d4d6759711c843c37e34a02b8bb45653e5b8f4e2f0a2 image, which I believe should have the updates from https://github.com/docker-library/docker/issues/463 (please let me know if it doesn't).
I can reproduce our issues by launching dind with
docker run --rm -ti --privileged --name docker -e DOCKER_TLS_CERTDIR= -p 2375:2375 docker@sha256:8f9c4d8cdaa2f87b5269d4d6759711c843c37e34a02b8bb45653e5b8f4e2f0a2
I believe the bridge warning to be a red herring, as I see that in docker:24.0.6-dind, which works.
I then run a debian container and try an apt-get update
Network works if I run the inner container with --net=host.
Host info: