Open bencyq opened 1 month ago
So the only node where it's failing is the arm64 one? I'll check whether the arm64 container image was built correctly. Could you check the logs of the failing pod with kubectl?
Thank you for your reply. Here are the logs.
$ kubectl logs kube-flannel-ds-vjhqf -n kube-flannel
Defaulted container "kube-flannel" out of: kube-flannel, install-cni-plugin (init), install-cni (init)
Error from server (BadRequest): container "kube-flannel" in pod "kube-flannel-ds-vjhqf" is waiting to start: PodInitializing
$ kubectl logs kube-proxy-xx8vt -n kube-system
failed to try resolving symlinks in path "/var/log/pods/kube-system_kube-proxy-xx8vt_7758f284-f039-4aeb-bbf5-11da20d35c8f/kube-proxy/6043.log": lstat /var/log/pods/kube-system_kube-proxy-xx8vt_7758f284-f039-4aeb-bbf5-11da20d35c8f/kube-proxy/6043.log: no such file or directory
What is the output of the following?
kubectl logs kube-flannel-ds-vjhqf -n kube-flannel -c install-cni-plugin
crictl ps -a
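The log request above can be repeated for each init container in one go. This is only a rough sketch: the function name `init_logs` is hypothetical, and the container names and namespace are the ones shown in the `kubectl logs` output earlier in this thread.

```shell
# Hypothetical helper: print the logs of each init container of a flannel pod.
# Container names and the default namespace are taken from this thread.
init_logs() {
  pod="$1"
  ns="${2:-kube-flannel}"
  for c in install-cni-plugin install-cni; do
    echo "== $c =="
    # kubectl exits non-zero when the container never started; keep going.
    kubectl logs "$pod" -n "$ns" -c "$c" || echo "(no logs available for $c)"
  done
}
```

Usage would be `init_logs kube-flannel-ds-vjhqf`, run from a machine with access to the cluster.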
I am facing a very similar-looking problem: the Flannel DaemonSet pod fails to come up. It fails on the install-cni-plugin init container, which goes into state CreateContainerConfigError pretty much immediately.
NAME↑ PF IMAGE READY STATE INIT RESTARTS PROBES(L:R) CPU/R:L MEM/R:L PORTS AGE
install-cni ● docker.io/flannel/flannel:v0.25.6 true Completed true 0 off:off 0:0 0:0 6h22
install-cni-plugin ● docker.io/flannel/flannel-cni-plugin:v1.5.1-flannel2 false CreateContainerConfigError true 0 off:off 0:0 0:0 6h22
kube-flannel ● docker.io/flannel/flannel:v0.25.6 false Unknown false 0 off:off 100:0 50:0 6h22
Requesting its logs does not give any output:
unable to retrieve container logs for containerd://f00f752cb6d46e4b2f866d5f6ec5ca3be330353121177d7480db396ecace6904
As I understand CreateContainerConfigError, no output is to be expected, since the error happens before any binary or entrypoint from the image gets started.
kubectl describe pod kube-flannel-ds-59hwh shows this error:
Warning Failed 16m (x12 over 18m) kubelet Error: services have not yet been read at least once, cannot construct envvars
This happens when upgrading Kubernetes from v1.30.3 to v1.31.0, as in "it happens with all nodes I reboot into the later version".
Looking into CHANGELOG-1.31, I am lost as to what could be related.
My guess is that it may be related to the use of the Downward API for two env: variables, which get filled via fieldRef: -> fieldPath: metadata.XYZ. That's more guessing than knowing.
I reverted my Kubernetes version from v1.31.0 to v1.30.3, and the flannel-cni-plugin:v1.5.1-flannel2 init container succeeds, so flannel:v0.25.6 starts up fine as a consequence ... so I think there is clearly a problem coming from the changes in Kubernetes v1.31.0.
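To check the guess about Downward API variables, the env entries filled via fieldRef can be listed straight from the DaemonSet spec. A minimal sketch, assuming the DaemonSet is named `kube-flannel-ds` in namespace `kube-flannel` (inferred from the pod names in this thread); the function name is hypothetical.

```shell
# Hypothetical helper: for each (init) container in the DaemonSet template,
# print its name followed by any fieldRef paths used by its env variables.
# DaemonSet name and namespace defaults are assumptions based on the pod names.
list_fieldref_env() {
  ds="${1:-kube-flannel-ds}"
  ns="${2:-kube-flannel}"
  kubectl get daemonset "$ds" -n "$ns" \
    -o jsonpath='{range .spec.template.spec.initContainers[*]}{.name}{": "}{.env[*].valueFrom.fieldRef.fieldPath}{"\n"}{end}'
  kubectl get daemonset "$ds" -n "$ns" \
    -o jsonpath='{range .spec.template.spec.containers[*]}{.name}{": "}{.env[*].valueFrom.fieldRef.fieldPath}{"\n"}{end}'
}
```

A container with no fieldRef-based variables prints only its name, which would show whether the failing install-cni-plugin init container uses the Downward API at all.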
Hi, I tested flannel with k8s 1.31 on both amd64 and arm64 and had no issues. My test was on Ubuntu 24.04.
Can you show the kernel version that you're using and the kernel logs?
the system is
$ uname -a
Linux kc04 6.6.43-flatcar #1 SMP PREEMPT_DYNAMIC Mon Aug 5 20:36:27 -00 2024 x86_64 Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz GenuineIntel GNU/Linux
$ cat /etc/os-release
NAME="Flatcar Container Linux by Kinvolk"
ID=flatcar
ID_LIKE=coreos
VERSION=3975.2.0
VERSION_ID=3975.2.0
BUILD_ID=2024-08-05-2103
SYSEXT_LEVEL=1.0
PRETTY_NAME="Flatcar Container Linux by Kinvolk 3975.2.0 (Oklo)"
ANSI_COLOR="38;5;75"
HOME_URL="https://flatcar.org/"
BUG_REPORT_URL="https://issues.flatcar.org"
FLATCAR_BOARD="amd64-usr"
CPE_NAME="cpe:2.3:o:flatcar-linux:flatcar_linux:3975.2.0:*:*:*:*:*:*:*"
To switch the K8s version, I toggle the /etc/extensions/kubernetes.raw symlink between /opt/extensions/kubernetes/kubernetes-v1.30.3-x86-64.raw and /opt/extensions/kubernetes/kubernetes-v1.31.0-x86-64.raw (this is Flatcar's way of "blending in" software, utilizing systemd-sysext with their sysext-bakery).
I can't provide kernel logs right now, as I would need to take down a node to get a clean one, and they are all busy.
Note that the kernel version and OS version do not change here; I really only toggle the K8s binaries and reboot to have it all start properly after "blending".
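The symlink toggle described above could be scripted roughly as follows. This is a sketch, not the poster's actual procedure: the function name and the `EXT_LINK`/`EXT_DIR` override variables are hypothetical, and the default paths are the ones quoted in this thread.

```shell
# Hypothetical helper: point the kubernetes.raw symlink at the sysext image
# for the given version. Default paths are the ones quoted in this thread;
# EXT_LINK and EXT_DIR exist only so the paths can be overridden.
switch_k8s_sysext() {
  version="$1"   # e.g. v1.30.3 or v1.31.0
  link="${EXT_LINK:-/etc/extensions/kubernetes.raw}"
  dir="${EXT_DIR:-/opt/extensions/kubernetes}"
  target="${dir}/kubernetes-${version}-x86-64.raw"
  if [ ! -e "$target" ]; then
    echo "no such image: $target" >&2
    return 1
  fi
  # -n replaces the symlink itself instead of following it.
  ln -sfn "$target" "$link"
}
```

For example, `switch_k8s_sysext v1.30.3` followed by a reboot would correspond to the downgrade described in this thread.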
Expected Behavior
k8s pod kube-flannel-ds-vjhqf is ready; docker.io/flannel/flannel-cni-plugin:v1.5.1-flannel1 is functioning properly
Current Behavior
k8s pod kube-flannel-ds-vjhqf is always at state: Init:RunContainerError pod kube-flannel-ds-vjhqf failed at
Possible Solution
Steps to Reproduce (for bugs)
Context
The node can never become ready in the k8s cluster. I'm using an arm64 machine as a node joining an x86 cluster; does that matter?
Your Environment