Closed tianon closed 6 months ago
I was asked to test this on GitLab's infrastructure in https://github.com/docker-library/docker/issues/463#issuecomment-1861821236. Below are the test results, which are successful 🎉:
# Spin up server with our image
$ gcloud compute instances create docker-test-cos-85 --image cos-85-13310-1498-7 --image-project cos-cloud --zone=us-east1-c
# Build image
$ gcloud compute ssh docker-test-cos-85
$ docker build --pull 'https://github.com/docker-library/docker.git#refs/pull/468/merge:24/dind'
# Start Docker image
steve@docker-test-cos-85 ~ $ docker run --privileged --rm -it 10bb05416255
Certificate request self-signature ok
subject=CN = docker:dind server
/certs/server/cert.pem: OK
Certificate request self-signature ok
subject=CN = docker:dind client
/certs/client/cert.pem: OK
ip: can't find device 'nf_tables'
modprobe: can't change directory to '/lib/modules': No such file or directory
ip: can't find device 'ip_tables'
modprobe: can't change directory to '/lib/modules': No such file or directory
INFO[2023-12-19T10:25:43.763856164Z] Starting up
...
INFO[2023-12-19T10:25:43.850111407Z] containerd successfully booted in 0.054331s
INFO[2023-12-19T10:25:43.892042049Z] Loading containers: start.
INFO[2023-12-19T10:25:43.978291484Z] Loading containers: done.
INFO[2023-12-19T10:25:43.991579126Z] Docker daemon commit=311b9ff graphdriver=overlay2 version=24.0.7
INFO[2023-12-19T10:25:43.992161922Z] Daemon has completed initialization
INFO[2023-12-19T10:25:44.036184473Z] API listen on /var/run/docker.sock
INFO[2023-12-19T10:25:44.036664068Z] API listen on [::]:2376
Extremely appreciated, @stevexuereb :bow: (it's always stressful when we break lots of people at once :sweat_smile:)
I thought about adding the following as well, but while discussing it with @ag-TJNII in #466 we came up with the idea of an explicit configuration flag for getting legacy instead, since checking loaded modules is really fragile (what if the module was loaded a while ago and never used? what if one module is built-in in the kernel config, but the other isn't? etc etc etc)
diff --git a/dockerd-entrypoint.sh b/dockerd-entrypoint.sh
index e610cca..0d3a581 100755
--- a/dockerd-entrypoint.sh
+++ b/dockerd-entrypoint.sh
@@ -156,6 +156,10 @@ if [ "$1" = 'dockerd' ]; then
 		# https://git.netfilter.org/iptables/tree/iptables/nft-shared.c?id=f5cf76626d95d2c491a80288bccc160c53b44e88#n420
 		# if we already have any "legacy" iptables rules, we should always use legacy (https://github.com/docker-library/docker/pull/468#discussion_r1430804593)
 		iptablesLegacy=1
+	elif grep -qE '^ip_tables ' /proc/modules && ! grep -qE '^nf_tables ' /proc/modules; then
+		# if the "ip_tables" module is loaded but the "nf_tables" module is not, we should probably use legacy (to match the host)
+		# in theory, this helps with broken implementations like CentOS 7 which *has* the "nf_tables" module but it appears to not work properly inside a network namespace (https://github.com/docker-library/docker/issues/466)
+		iptablesLegacy=1
 	elif ! iptables -nL > /dev/null 2>&1; then
 		# if iptables fails to run, chances are high the necessary kernel modules aren't loaded (perhaps the host is using xtables, for example)
 		# https://github.com/docker-library/docker/issues/350
Since the current implementation amounts to effectively "always use legacy" (matching the previous Alpine 3.18 image behavior), I also think it might be prudent at this point in December to wait until the new year to move further on this (in the interest of being respectful to folks' holidays if we do manage to regress again with this).
I repeated @stevexuereb's test with Google COS 105. It works with --env DOCKER_IPTABLES_LEGACY=1, but fails without it, as explained in https://github.com/docker-library/docker/issues/467#issuecomment-1868218610.
I spent a bunch of time today trying to find a way to detect CONFIG_NF_TABLES being loaded/built-in on the kernel reliably (ie, even if it's set to =Y and thus won't show up in /proc/modules), and came up empty handed. :disappointed:
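For anyone landing here later, a rough sketch of the kind of probing this involves. The function name nf_tables_hint and its two path arguments are hypothetical (nothing like this exists in the image), and it assumes a plain-text kernel config such as /boot/config-$(uname -r), which is usually not visible from inside a container. "unknown" is therefore the common outcome, which is the whole problem.

```shell
#!/bin/sh
# Hypothetical helper (not part of the image): report what can be learned
# about nf_tables from a /proc/modules-style file and an optional
# plain-text kernel config file.
# Prints one of: module, builtin, not-loaded, disabled, unknown.
nf_tables_hint() {
	modules_file=$1   # normally /proc/modules
	config_file=$2    # e.g. /boot/config-$(uname -r); often missing

	# A loaded module appears as a line starting with its name.
	if [ -r "$modules_file" ] && grep -q '^nf_tables ' "$modules_file"; then
		echo module
		return 0
	fi

	# Without a readable kernel config there is nothing more to check.
	if [ ! -r "$config_file" ]; then
		echo unknown
		return 0
	fi

	if grep -q '^CONFIG_NF_TABLES=y$' "$config_file"; then
		# built into the kernel, so it never shows up in /proc/modules
		echo builtin
	elif grep -q '^CONFIG_NF_TABLES=m$' "$config_file"; then
		# available as a module, but not currently loaded
		echo not-loaded
	else
		echo disabled
	fi
}
```

Usage would be something like `nf_tables_hint /proc/modules "/boot/config-$(uname -r)"`. Even a "module" or "builtin" answer says nothing about whether nf_tables actually works inside a network namespace (the CentOS 7 case from #466).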
The more I mess with this, the more I'm of the opinion that we should follow the distros/kernel (for example, Debian 10 / Buster is when Debian switched to nftables, and Ubuntu was definitely switched in 20.04) and just default to nftables with the manual (and temporary!) legacy escape hatch. At most maybe (correctly) implementing the "if legacy rules seem to exist, use legacy" (https://github.com/docker-library/docker/pull/468#discussion_r1431778320), since that's a strong signal that we should use legacy.
What I don't want to do is give the appearance that anything but nftables is actually well-supported, because we don't really know (or have any control over) how long "legacy" will still be available/usable/possible for us to support (the legacy wrappers being removed entirely from Alpine, for example).
Edit: oh. that's basically what I've implemented here, but fixing the -s tests. it's been a few weeks now, my brain's slipping :sob:
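The decision order described above (escape hatch first, then "legacy rules exist", otherwise nftables) could be sketched roughly like this. This is not the actual entrypoint code: the function name choose_iptables_backend and its rule-count argument are hypothetical, and it assumes any non-empty DOCKER_IPTABLES_LEGACY value means "use legacy".

```shell
#!/bin/sh
# Hypothetical sketch of the selection order, not the real entrypoint logic.
# Argument: number of existing legacy iptables rules (0 if none or unknown).
choose_iptables_backend() {
	legacy_rule_count=${1:-0}

	if [ -n "${DOCKER_IPTABLES_LEGACY:-}" ]; then
		# explicit (and temporary!) escape hatch wins
		# (assumption: any non-empty value selects legacy)
		echo legacy
	elif [ "$legacy_rule_count" -gt 0 ]; then
		# existing legacy rules are a strong signal to stay on legacy
		echo legacy
	else
		# otherwise follow the distros/kernel
		echo nf_tables
	fi
}
```

One possible way to get the count (an assumption, not necessarily what the entrypoint does) is `iptables-legacy-save 2>/dev/null | grep -c '^-A'`, which prints 0 when there are no rules or the legacy tooling is unavailable.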
With this latest push, I successfully get iptables v1.8.10 (legacy) on CentOS 7 (even with nf_tables loaded and supposedly working) and iptables v1.8.10 (nf_tables) on my Debian host.
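If you want to check which variant you ended up with, the version string carries the backend in parentheses, as in the two outputs above. A minimal sketch for pulling it out (the function name iptables_backend_from_version is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: map an `iptables --version` string to its backend.
# Prints one of: legacy, nf_tables, unknown.
iptables_backend_from_version() {
	case "$1" in
		*'(legacy)'*)    echo legacy ;;
		*'(nf_tables)'*) echo nf_tables ;;
		*)               echo unknown ;;   # e.g. older builds with no suffix
	esac
}
```

Inside the running container this would be called as `iptables_backend_from_version "$(iptables --version)"`.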
Anyone feel like testing this one more time before we merge? :eyes: (ie, last call! :sweat_smile:)
Tested the latest changes with Google Container OS:
- COS 85: ✅
- COS 105 needs --env DOCKER_IPTABLES_LEGACY=1, but otherwise ✅
Thanks @stanhu!! It's really appreciated :bow:
Some more background information, for documentation purposes only, because internet searches lead to this issue.
It is not an issue with COS 105 in general, but with build ID 17412.226.68 and below. Those builds already include the nf_tables kernel module, which is detected and used by the docker startup scripts. However, the module is not functional yet in those builds and, thus, the startup scripts fail.
Starting with build ID 17412.294.10, the nf_tables kernel module has been officially announced and is functional.
As a result, our GKE 1.26.6-gke.1700 (uses cos-101-17162-210-48) was working before the automatic update in the REGULAR release channel but was not after the upgrade to 1.27.8-gke.1067004 (uses cos-105-17412-226-62). We had to manually upgrade to the RAPID release channel (1.29.1-gke.1589017 with cos-109-17800-66-78) or pin the version to 1.27.11-gke.1062000 (uses cos-105-17412-294-29).
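To check a given COS 105 node image against the two build IDs mentioned above, a rough sketch follows. The function names build_cmp and cos105_nf_tables_status are made up for illustration; build IDs are expected in dotted form (so cos-105-17412-294-29 becomes 17412.294.29), and anything between the two known builds is reported as unknown because nothing above says where exactly the fix landed.

```shell
#!/bin/sh
# Hypothetical helper: compare two dotted numeric build IDs.
# Prints -1, 0, or 1 (first argument lower, equal, higher).
build_cmp() {
	a=$1
	b=$2
	while [ -n "$a" ] || [ -n "$b" ]; do
		x=${a%%.*}
		y=${b%%.*}
		x=${x:-0}
		y=${y:-0}
		if [ "$x" -lt "$y" ]; then echo -1; return 0; fi
		if [ "$x" -gt "$y" ]; then echo 1; return 0; fi
		# drop the leading component and continue with the rest
		case "$a" in *.*) a=${a#*.} ;; *) a= ;; esac
		case "$b" in *.*) b=${b#*.} ;; *) b= ;; esac
	done
	echo 0
}

# Hypothetical helper: classify a COS 105 build ID using only the two
# data points from this thread.
# Prints one of: nonfunctional, functional, unknown, not-cos-105.
cos105_nf_tables_status() {
	build=$1
	case "$build" in
		17412.*) ;;                          # COS 105 builds start with 17412
		*) echo not-cos-105; return 0 ;;
	esac
	if [ "$(build_cmp "$build" 17412.226.68)" -le 0 ]; then
		echo nonfunctional
	elif [ "$(build_cmp "$build" 17412.294.10)" -ge 0 ]; then
		echo functional
	else
		echo unknown
	fi
}
```

For the GKE versions above, `cos105_nf_tables_status 17412.226.62` reports nonfunctional and `cos105_nf_tables_status 17412.294.29` reports functional.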
The nature of modprobe in this image is that it works via ip hacks, but the exit code will always be non-zero because we don't have /lib/modules from the host.
The effect of this was that everyone was using iptables-legacy (whether it was warranted for them to be doing so or not).
This probably fixes #466
This probably also fixes #467