# Using the ZFS containerd snapshotter, `kubeadm join` fails because it cannot contact the API server on 127.0.0.1; containerd snapshotter not specified for nerdctl #11734
### What happened?

I added a node to a Kubernetes cluster where `/var/lib/containerd` is on ZFS. To make that work with Linux 5.x kernels, one needs to use the ZFS snapshotter for containerd. Fortunately, kubespray provides a `containerd_snapshotter` variable to customize which snapshotter to use.

Note: I solved this issue for myself in a somewhat hacky way. I don't think it should be upstreamed as-is, but it will help you get an idea of what it is: https://github.com/kubernetes-sigs/kubespray/compare/release-2.25...AlignmentResearch:kubespray:far/zfs-gpu-fixes?expand=1
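For what it's worth, this is roughly how I sanity-checked that the intended snapshotter is actually in effect. These are standard containerd/ZFS commands, but the checks themselves are my own, not something kubespray runs:

```
# The snapshotter kubespray rendered into containerd's config:
grep -n 'snapshotter' /etc/containerd/config.toml
# Confirm the zfs snapshotter plugin loaded without error:
ctr plugins ls | grep -i zfs
# The zfs snapshotter needs a ZFS dataset mounted at its root directory:
zfs list -o name,mountpoint | grep io.containerd.snapshotter.v1.zfs
```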
### Symptom 1: `kubeadm join` fails to connect to localhost
The first observable symptom was the deployment failing at the `kubeadm join` step with:

```
error execution phase preflight: unable to fetch the kubeadm-config ConfigMap: failed to get config map: Get "https://127.0.0.1:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config?timeout=10s": dial tcp 127.0.0.1:6443: connect: connection refused
```
Complete trace of running kubeadm
```
# /usr/local/bin/kubeadm join --config /etc/kubernetes/kubeadm-client.conf --ignore-preflight-errors=all --skip-phases= --v=5
I1121 14:11:50.795056 345068 join.go:413] [preflight] found NodeName empty; using OS hostname as NodeName
I1121 14:11:50.795096 345068 joinconfiguration.go:76] loading configuration from "/etc/kubernetes/kubeadm-client.conf"
[preflight] Running pre-flight checks
I1121 14:11:50.795597 345068 preflight.go:93] [preflight] Running general checks
I1121 14:11:50.795624 345068 checks.go:280] validating the existence of file /etc/kubernetes/kubelet.conf
I1121 14:11:50.795631 345068 checks.go:280] validating the existence of file /etc/kubernetes/bootstrap-kubelet.conf
I1121 14:11:50.795639 345068 checks.go:104] validating the container runtime
I1121 14:11:50.811470 345068 checks.go:639] validating whether swap is enabled or not
I1121 14:11:50.811498 345068 checks.go:370] validating the presence of executable crictl
I1121 14:11:50.811507 345068 checks.go:370] validating the presence of executable conntrack
I1121 14:11:50.811517 345068 checks.go:370] validating the presence of executable ip
I1121 14:11:50.811528 345068 checks.go:370] validating the presence of executable iptables
I1121 14:11:50.811540 345068 checks.go:370] validating the presence of executable mount
I1121 14:11:50.811550 345068 checks.go:370] validating the presence of executable nsenter
I1121 14:11:50.811560 345068 checks.go:370] validating the presence of executable ethtool
I1121 14:11:50.811569 345068 checks.go:370] validating the presence of executable tc
I1121 14:11:50.811577 345068 checks.go:370] validating the presence of executable touch
I1121 14:11:50.811588 345068 checks.go:516] running all checks
I1121 14:11:50.819640 345068 checks.go:401] checking whether the given node name is valid and reachable using net.LookupHost
I1121 14:11:50.819810 345068 checks.go:605] validating kubelet version
I1121 14:11:50.848860 345068 checks.go:130] validating if the "kubelet" service is enabled and active
I1121 14:11:50.855725 345068 checks.go:203] validating availability of port 10250
I1121 14:11:50.855841 345068 checks.go:280] validating the existence of file /etc/kubernetes/ssl/ca.crt
I1121 14:11:50.855849 345068 checks.go:430] validating if the connectivity type is via proxy or direct
I1121 14:11:50.855871 345068 checks.go:329] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables
I1121 14:11:50.855888 345068 checks.go:329] validating the contents of file /proc/sys/net/ipv4/ip_forward
I1121 14:11:50.855902 345068 join.go:532] [preflight] Discovering cluster-info
I1121 14:11:50.855928 345068 token.go:80] [discovery] Created cluster-info discovery client, requesting info from "10.15.45.57:6443"
I1121 14:11:50.861147 345068 token.go:118] [discovery] Requesting info from "10.15.45.57:6443" again to validate TLS against the pinned public key
I1121 14:11:50.865278 345068 token.go:135] [discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "10.15.45.57:6443"
I1121 14:11:50.865290 345068 discovery.go:52] [discovery] Using provided TLSBootstrapToken as authentication credentials for the join process
I1121 14:11:50.865299 345068 join.go:546] [preflight] Fetching init configuration
I1121 14:11:50.865304 345068 join.go:592] [preflight] Retrieving KubeConfig objects
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
Get "https://127.0.0.1:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config?timeout=10s": dial tcp 127.0.0.1:6443: connect: connection refused
failed to get config map
```
The reason for this is that there's no nginx static pod running on localhost, forwarding API server calls to the actual API server. But why would there be, if the kubelet hasn't been set up yet? Running `systemctl start kubelet` made it error out on `/etc/kubernetes/bootstrap-kubelet.conf`, which makes sense because that file is created by kubeadm.
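Two quick checks make the chicken-and-egg problem visible (the manifest path is kubespray's usual static-pod directory, assumed here):

```
# Nothing is listening where kubeadm expects the API server:
ss -ltnp | grep 6443 || echo "nothing listening on :6443"
# And no static pod manifests exist yet, because the kubelet
# (which would run the nginx proxy) hasn't been bootstrapped:
ls /etc/kubernetes/manifests/
```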
### Symptom 2: all the pre-downloaded container images were unusable
I tried to set up an nginx proxy container manually with `ctr`, and got the following error:

```
ctr: failed to prepare extraction snapshot extract-314484180-5a4G sha256:82779bba718f2fd100a0d5e3020316a466fb44af95aee66dcdfdc6c07786be73: failed to create snapshot: missing parent k8s.io/98/sha256:cc2447e1835a40530975ab80bb1f872fbab0f2a0faecf2ab16fbbb89b3589438 bucket: not found
```
The reason is that the images had been pulled without specifying the snapshotter, and `ctr` does not read `/etc/containerd/config.toml` to decide which snapshotter to use. So containerd had been using `overlayfs`, which fails on ZFS with Linux 5.x.
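`ctr` takes its snapshotter per invocation, so re-pulling with it set explicitly avoided the missing-parent error. The image reference below is just an example:

```
# ctr ignores the CRI snapshotter setting in config.toml; pass it explicitly:
ctr -n k8s.io images pull --snapshotter zfs docker.io/library/nginx:1.25-alpine
# nerdctl accepts the same flag globally (or reads it from nerdctl.toml):
nerdctl --snapshotter zfs -n k8s.io images ls
```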
### Symptom 3: kubelet still stuck, missing `ca.crt`
Even after making sure `/etc/containerd/config.toml` has ZFS specified and the correct images were downloaded with the ZFS snapshotter, checking `systemctl status kubelet` after kubeadm fails to contact localhost shows this error:

```
Nov 21 18:26:34 node04 kubelet[130647]: E1121 18:26:34.780780 130647 run.go:74] "command failed" err="failed to construct kubelet dependencies: unable to load client CA file /etc/kubernetes/ssl/ca.crt: open /etc/kubernetes/ssl/ca.crt: no such file or directory
```
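Two checks that narrowed this down (the path comes from the preflight output above):

```
# kubeadm's preflight validated this path earlier, but nothing has created the file:
ls -l /etc/kubernetes/ssl/ca.crt
# Watch the kubelet crash-loop on the missing client CA:
journalctl -u kubelet --no-pager --since "5 min ago" | tail
```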
Is there something missing in the bootstrapping here? What is wrong?
Summary of changes I made to fix this:
- add `--snapshotter={{ containerd_snapshotter }}` to `nerdctl_image_pull_command` in `roles/kubespray-defaults/defaults/main/download.yml`
- use `containerd_snapshotter` instead of the nonexistent `nerdctl_snapshotter` in `roles/container-engine/nerdctl/templates/nerdctl.toml.j2`
- run an nginx proxy container manually so that kubeadm can bootstrap (sketched below); this worked, but without the manual proxy container kubeadm cannot bootstrap
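Here is a rough sketch of the manual proxy workaround, on the assumption that a plain TCP proxy on 127.0.0.1:6443 is all kubeadm needs here. The image tag and file paths are mine, not kubespray's; the upstream address comes from the discovery logs above:

```
cat > /tmp/nginx-proxy.conf <<'EOF'
events {}
stream {
  server {
    listen 127.0.0.1:6443;        # where kubeadm tries to reach the API server
    proxy_pass 10.15.45.57:6443;  # the real API server
  }
}
EOF
# --snapshotter must be passed explicitly here too, for the same reason as with ctr:
nerdctl --snapshotter zfs run -d --name nginx-proxy --net host \
  -v /tmp/nginx-proxy.conf:/etc/nginx/nginx.conf:ro \
  docker.io/library/nginx:1.25-alpine
```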
### What did you expect to happen?
I expected the node to join the cluster without problems.
### How can we reproduce it (as minimally and precisely as possible)?
Deploy a cluster on a ZFS filesystem, setting `containerd_snapshotter=zfs`.
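Roughly, assuming a spare disk (pool and dataset names are made up), the setup would look like this; note the containerd ZFS snapshotter also expects a dataset mounted at its own root directory:

```
# Put containerd's state on ZFS:
zpool create tank /dev/sdX
zfs create -o mountpoint=/var/lib/containerd tank/containerd
# The zfs snapshotter requires a ZFS filesystem at its root as well:
zfs create -o mountpoint=/var/lib/containerd/io.containerd.snapshotter.v1.zfs \
  tank/containerd-snapshotter
# Then run kubespray with containerd_snapshotter: zfs in group_vars.
```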
### OS
```
Linux 5.15.0-126-generic x86_64
PRETTY_NAME="Ubuntu 22.04.5 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.5 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
```
### Version of Ansible
```
ansible [core 2.16.13]
  config file = None
  configured module search path = ['/Users/adria/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /Users/adria/.pyenv/versions/3.10.13/envs/fl-3.10/lib/python3.10/site-packages/ansible
  ansible collection location = /Users/adria/.ansible/collections:/usr/share/ansible/collections
  executable location = /Users/adria/.pyenv/versions/fl-3.10/bin/ansible
  python version = 3.10.13 (main, Mar 23 2024, 16:18:41) [Clang 15.0.0 (clang-1500.1.0.2.5)] (/Users/adria/.pyenv/versions/3.10.13/envs/fl-3.10/bin/python)
  jinja version = 3.1.4
  libyaml = True
```
(yes, I'm running this from a Mac host, which is definitely not part of the cluster and connects to all the Linuxes)
### Version of Python
Python 3.10.13
### Version of Kubespray (commit)

`586ba66b7`
### Network plugin used
calico
### Full inventory with variables
Too much sensitive stuff -- I'll provide if really necessary
### Command used to invoke ansible

```
ansible-playbook -v scale.yaml -i ../inventory/cluster/hosts.yaml --become --become-user=root \
  --extra-vars="@../inventory/cluster/group_vars/hardening.yaml" \
  --extra-vars="ansible_ssh_private_key_file=${SSH_PRIVKEY}"
```
### Output of ansible run
I'll provide this if relevant -- it's not easy for me to recapture now.
### Anything else we need to know
I apologize for not providing a reproduction and hope this is enough info. At the very least I hope you'll appreciate the surefire bug of using the `nerdctl_snapshotter` variable, which does not exist.