openconfig / kne

KinD cluster deploy issue - timeout - kindnet Init:ImagePullBackOff #345

Closed tomaszkazmierczak closed 1 year ago

tomaszkazmierczak commented 1 year ago

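Any advice? kne deploy stalls while waiting for metallb-system to become healthy, and the kindnet pod is stuck in Init:ImagePullBackOff.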


root@ubuntu2:~/kne# kne deploy deploy/kne/kind-bridge.yaml
I0330 02:04:27.956818 15078 deploy.go:141] Deploying cluster...
I0330 02:04:28.136665 15078 deploy.go:404] kind version valid: got v0.17.0 want v0.17.0
I0330 02:04:28.136751 15078 deploy.go:411] Attempting to recycle existing cluster "kne"...
W0330 02:04:28.785704 15078 deploy.go:52] (kubectl): error: context "kind-kne" does not exist
I0330 02:04:28.790215 15078 deploy.go:436] Creating kind cluster with: [create cluster --name kne --image kindest/node:v1.26.0 --config /root/kne/kind/kind-no-cni.yaml]
W0330 02:04:29.309393 15078 deploy.go:52] (kind): Creating cluster "kne" ...
W0330 02:04:29.309451 15078 deploy.go:52] (kind): • Ensuring node image (kindest/node:v1.26.0) 🖼 ...
W0330 02:04:39.030434 15078 deploy.go:52] (kind): ✓ Ensuring node image (kindest/node:v1.26.0) 🖼
W0330 02:04:39.030476 15078 deploy.go:52] (kind): • Preparing nodes 📦 ...
W0330 02:04:54.563861 15078 deploy.go:52] (kind): ✓ Preparing nodes 📦
W0330 02:04:54.810156 15078 deploy.go:52] (kind): • Writing configuration 📜 ...
W0330 02:04:55.984071 15078 deploy.go:52] (kind): ✓ Writing configuration 📜
W0330 02:04:55.984101 15078 deploy.go:52] (kind): • Starting control-plane 🕹️ ...
W0330 02:05:12.591261 15078 deploy.go:52] (kind): ✓ Starting control-plane 🕹️
W0330 02:05:12.591287 15078 deploy.go:52] (kind): • Installing StorageClass 💾 ...
W0330 02:05:16.025463 15078 deploy.go:52] (kind): ✓ Installing StorageClass 💾
W0330 02:05:17.530321 15078 deploy.go:52] (kind): Set kubectl context to "kind-kne"
W0330 02:05:17.530343 15078 deploy.go:52] (kind): You can now use your cluster with:
W0330 02:05:17.530347 15078 deploy.go:52] (kind): kubectl cluster-info --context kind-kne
W0330 02:05:17.530355 15078 deploy.go:52] (kind): Thanks for using kind! 😊
I0330 02:05:17.532867 15078 deploy.go:440] Deployed kind cluster: kne
I0330 02:05:17.532919 15078 deploy.go:454] Found manifest "/root/kne/manifests/kind/kind-bridge.yaml"
I0330 02:05:17.945531 15078 deploy.go:49] (kubectl): clusterrole.rbac.authorization.k8s.io/kindnet created
I0330 02:05:17.953701 15078 deploy.go:49] (kubectl): clusterrolebinding.rbac.authorization.k8s.io/kindnet created
I0330 02:05:17.961998 15078 deploy.go:49] (kubectl): serviceaccount/kindnet created
I0330 02:05:17.972441 15078 deploy.go:49] (kubectl): daemonset.apps/kindnet created
I0330 02:05:17.977346 15078 deploy.go:145] Cluster deployed
I0330 02:05:18.061496 15078 deploy.go:49] (kubectl): Kubernetes control plane is running at https://127.0.0.1:33603
I0330 02:05:18.061527 15078 deploy.go:49] (kubectl): CoreDNS is running at https://127.0.0.1:33603/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
I0330 02:05:18.061537 15078 deploy.go:49] (kubectl): To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
I0330 02:05:18.063741 15078 deploy.go:149] Cluster healthy
I0330 02:05:18.094258 15078 deploy.go:160] Checking kubectl versions.
WARNING: version difference between client (1.24) and server (1.26) exceeds the supported minor version skew of +/-1
I0330 02:05:18.192486 15078 deploy.go:192] Deploying ingress...
I0330 02:05:18.205668 15078 deploy.go:696] Creating metallb namespace
I0330 02:05:18.205720 15078 deploy.go:715] Deploying MetalLB from: /root/kne/manifests/metallb/manifest.yaml
I0330 02:05:18.508827 15078 deploy.go:49] (kubectl): namespace/metallb-system created
I0330 02:05:18.521972 15078 deploy.go:49] (kubectl): customresourcedefinition.apiextensions.k8s.io/addresspools.metallb.io created
I0330 02:05:18.530773 15078 deploy.go:49] (kubectl): customresourcedefinition.apiextensions.k8s.io/bfdprofiles.metallb.io created
I0330 02:05:18.544036 15078 deploy.go:49] (kubectl): customresourcedefinition.apiextensions.k8s.io/bgpadvertisements.metallb.io created
I0330 02:05:18.557350 15078 deploy.go:49] (kubectl): customresourcedefinition.apiextensions.k8s.io/bgppeers.metallb.io created
I0330 02:05:18.566570 15078 deploy.go:49] (kubectl): customresourcedefinition.apiextensions.k8s.io/communities.metallb.io created
I0330 02:05:18.575258 15078 deploy.go:49] (kubectl): customresourcedefinition.apiextensions.k8s.io/ipaddresspools.metallb.io created
I0330 02:05:18.593987 15078 deploy.go:49] (kubectl): customresourcedefinition.apiextensions.k8s.io/l2advertisements.metallb.io created
I0330 02:05:18.601097 15078 deploy.go:49] (kubectl): serviceaccount/controller created
I0330 02:05:18.606749 15078 deploy.go:49] (kubectl): serviceaccount/speaker created
I0330 02:05:18.614287 15078 deploy.go:49] (kubectl): role.rbac.authorization.k8s.io/controller created
I0330 02:05:18.622656 15078 deploy.go:49] (kubectl): role.rbac.authorization.k8s.io/pod-lister created
I0330 02:05:18.634749 15078 deploy.go:49] (kubectl): clusterrole.rbac.authorization.k8s.io/metallb-system:controller created
I0330 02:05:18.642580 15078 deploy.go:49] (kubectl): clusterrole.rbac.authorization.k8s.io/metallb-system:speaker created
I0330 02:05:18.652723 15078 deploy.go:49] (kubectl): rolebinding.rbac.authorization.k8s.io/controller created
I0330 02:05:18.663793 15078 deploy.go:49] (kubectl): rolebinding.rbac.authorization.k8s.io/pod-lister created
I0330 02:05:18.671275 15078 deploy.go:49] (kubectl): clusterrolebinding.rbac.authorization.k8s.io/metallb-system:controller created
I0330 02:05:18.682155 15078 deploy.go:49] (kubectl): clusterrolebinding.rbac.authorization.k8s.io/metallb-system:speaker created
I0330 02:05:18.708048 15078 deploy.go:49] (kubectl): secret/webhook-server-cert created
I0330 02:05:18.738683 15078 deploy.go:49] (kubectl): service/webhook-service created
I0330 02:05:18.758494 15078 deploy.go:49] (kubectl): deployment.apps/controller created
I0330 02:05:18.777772 15078 deploy.go:49] (kubectl): daemonset.apps/speaker created
I0330 02:05:18.798662 15078 deploy.go:49] (kubectl): validatingwebhookconfiguration.admissionregistration.k8s.io/metallb-webhook-configuration created
I0330 02:05:18.908225 15078 deploy.go:720] Creating metallb secret
I0330 02:05:18.945038 15078 deploy.go:1115] Waiting on deployment "metallb-system" to be healthy ... ...

root@ubuntu2:# kubectl get pods -A
NAMESPACE            NAME                                        READY   STATUS                  RESTARTS   AGE
kube-system          coredns-787d4945fb-lmdgp                    0/1     Pending                 0          6m8s
kube-system          coredns-787d4945fb-m89zq                    0/1     Pending                 0          6m8s
kube-system          etcd-kne-control-plane                      1/1     Running                 0          6m21s
kube-system          kindnet-dcgtk                               0/1     Init:ImagePullBackOff   0          6m8s
kube-system          kube-apiserver-kne-control-plane            1/1     Running                 0          6m21s
kube-system          kube-controller-manager-kne-control-plane   1/1     Running                 0          6m21s
kube-system          kube-proxy-rb7wg                            1/1     Running                 0          6m8s
kube-system          kube-scheduler-kne-control-plane            1/1     Running                 0          6m21s
local-path-storage   local-path-provisioner-c8855d4bb-jss7v      0/1     Pending                 0          6m8s
metallb-system       controller-8bb68977b-rjqc8                  0/1     Pending                 0          6m8s
root@ubuntu2:#
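
In a listing like the one above, the pods that are not Running are the ones worth inspecting first. A minimal sketch of a filter for that, assuming the default column layout of kubectl get pods -A (STATUS in the fourth column); the function name not_running_pods is hypothetical and not part of kne or kubectl:

```shell
# Print "namespace/name status" for every pod whose STATUS column is neither
# Running nor Completed. Reads `kubectl get pods -A` output on stdin and
# skips the header row.
not_running_pods() {
  awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}

# Usage (assumes kubectl is pointed at the kind-kne context):
#   kubectl get pods -A | not_running_pods
```

Here kindnet is the only pod in a failing state rather than merely Pending, which is consistent with the other pods waiting on the CNI before they can be scheduled.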

tomaszkazmierczak commented 1 year ago

Interestingly, kindnet cannot pull its image from inside the cluster, although a manual docker pull on the host works:

root@ubuntu3:~/kne# kubectl describe pod kindnet-l6fln -n kube-system
<..>
Events:
  Type     Reason     Age                  From               Message
  Normal   Scheduled  3m34s                default-scheduler  Successfully assigned kube-system/kindnet-l6fln to kne-control-plane
  Warning  Failed     2m21s                kubelet            Failed to pull image "aojea/kindnetd:v1.0.1": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/aojea/kindnetd:v1.0.1": failed to resolve reference "docker.io/aojea/kindnetd:v1.0.1": failed to do request: Head "https://registry-1.docker.io/v2/aojea/kindnetd/manifests/v1.0.1": dial tcp 34.194.164.123:443: i/o timeout
  Normal   BackOff    48s (x5 over 3m3s)   kubelet            Back-off pulling image "aojea/kindnetd:v1.0.1"
  Warning  Failed     48s (x5 over 3m3s)   kubelet            Error: ImagePullBackOff
  Normal   Pulling    35s (x4 over 3m33s)  kubelet            Pulling image "aojea/kindnetd:v1.0.1"
  Warning  Failed     5s (x3 over 3m3s)    kubelet            Failed to pull image "aojea/kindnetd:v1.0.1": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/aojea/kindnetd:v1.0.1": failed to resolve reference "docker.io/aojea/kindnetd:v1.0.1": failed to do request: Head "https://registry-1.docker.io/v2/aojea/kindnetd/manifests/v1.0.1": dial tcp 18.215.138.58:443: i/o timeout
  Warning  Failed     5s (x4 over 3m3s)    kubelet            Error: ErrImagePull
root@ubuntu3:~/kne#


root@ubuntu3:~/kne# docker pull aojea/kindnetd:v1.0.1
v1.0.1: Pulling from aojea/kindnetd
51b0e3ff7517: Pull complete
dd22051c691b: Pull complete
fad64b26f47a: Pull complete
a48222ec7c99: Pull complete
Digest: sha256:cb4e59bf2a28ada50ffd0b1120d223c58a344ba0694e8f8ef4140b1143bb84c4
Status: Downloaded newer image for aojea/kindnetd:v1.0.1
docker.io/aojea/kindnetd:v1.0.1
root@ubuntu3:~/kne#

root@ubuntu3:~/kne# docker image ls | grep kind
kindest/node     v1.26.0   6d3fbfb3da60   3 months ago   931MB
kindest/node     v1.24.3   bb1e33fb6934   7 months ago   922MB
aojea/kindnetd   v1.0.1    f19c8aef801e   9 months ago   128MB
root@ubuntu3:~/kne#

root@ubuntu3:~/kne# kubectl get pods -A
NAMESPACE            NAME                                        READY   STATUS                  RESTARTS   AGE
kube-system          coredns-787d4945fb-7mkrz                    0/1     Pending                 0          7m43s
kube-system          coredns-787d4945fb-h28z6                    0/1     Pending                 0          7m43s
kube-system          etcd-kne-control-plane                      1/1     Running                 0          7m56s
kube-system          kindnet-l6fln                               0/1     Init:ImagePullBackOff   0          7m43s
kube-system          kube-apiserver-kne-control-plane            1/1     Running                 0          7m57s
kube-system          kube-controller-manager-kne-control-plane   1/1     Running                 0          7m59s
kube-system          kube-proxy-vngb6                            1/1     Running                 0          7m43s
kube-system          kube-scheduler-kne-control-plane            1/1     Running                 0          7m56s
local-path-storage   local-path-provisioner-c8855d4bb-d7q2f      0/1     Pending                 0          7m43s
metallb-system       controller-8bb68977b-ng4zz                  0/1     Pending                 0          7m43s
root@ubuntu3:~/kne#
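
The kubelet events show the pull timing out from inside the kind node, while the host's Docker daemon reaches the registry without trouble. One possible stopgap, which the thread does not report trying, is to side-load the image from the host with kind load docker-image. The sketch below assumes the cluster name kne from the deploy log; the function names failed_images and sideload_failed_images are hypothetical. It only helps if the pod's imagePullPolicy permits a locally present image, and it does not fix whatever blocks the node's outbound traffic.

```shell
# Extract the unique image references that kubelet failed to pull.
# Reads `kubectl describe pod` output on stdin.
failed_images() {
  grep -o 'Failed to pull image "[^"]*"' | cut -d'"' -f2 | sort -u
}

# Pull each failed image on the host, then copy it into the kind node.
# $1 = pod name, $2 = namespace. The cluster name "kne" is an assumption
# taken from the deploy log.
sideload_failed_images() {
  kubectl describe pod "$1" -n "$2" | failed_images | while read -r img; do
    # </dev/null keeps the commands from consuming the loop's stdin.
    docker pull "$img" </dev/null && kind load docker-image "$img" --name kne </dev/null
  done
}

# Usage:
#   sideload_failed_images kindnet-l6fln kube-system
```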

tomaszkazmierczak commented 1 year ago

To close the loop: I resolved the issue in my setup by blocking the loading of iptables-related modules.


root@ubuntu3:~# date; kubectl get pods -A
Fri Mar 31 04:16:30 PM UTC 2023
NAMESPACE                        NAME                                                          READY   STATUS    RESTARTS   AGE
arista-ceoslab-operator-system   arista-ceoslab-operator-controller-manager-66cb57484f-9fvjx   2/2     Running   0          44m
ixiatg-op-system                 ixiatg-op-controller-manager-7b5db775d9-2h7kv                 2/2     Running   0          45m
kube-system                      coredns-787d4945fb-272kd                                      1/1     Running   0          45m
kube-system                      coredns-787d4945fb-bv59h                                      1/1     Running   0          45m
kube-system                      etcd-kne-control-plane                                        1/1     Running   0          46m
kube-system                      kindnet-dm9jb                                                 1/1     Running   0          45m
kube-system                      kube-apiserver-kne-control-plane                              1/1     Running   0          46m
kube-system                      kube-controller-manager-kne-control-plane                     1/1     Running   0          46m
kube-system                      kube-proxy-945xx                                              1/1     Running   0          45m
kube-system                      kube-scheduler-kne-control-plane                              1/1     Running   0          46m
lemming-operator                 lemming-controller-manager-6fc9d47f7d-2ffcb                   2/2     Running   0          43m
local-path-storage               local-path-provisioner-c8855d4bb-b5hwz                        1/1     Running   0          45m
meshnet                          meshnet-fn9br                                                 1/1     Running   0          45m
metallb-system                   controller-8bb68977b-jgkzp                                    1/1     Running   0          45m
metallb-system                   speaker-qqgtr                                                 1/1     Running   0          45m
srlinux-controller               srlinux-controller-controller-manager-5bddb8b985-96qj5        2/2     Running   0          44m
root@ubuntu3:~#

root@ubuntu3:~# iptables --version
iptables v1.8.7 (legacy)
root@ubuntu3:~#
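
The thread does not say which modules were blocked or how. For anyone comparing their own host, a rough sketch of how the iptables backend and module state could be inspected follows. The helper name iptables_backend is hypothetical, and the module names in the usage comments are common netfilter modules chosen as an assumption, not taken from this issue.

```shell
# Print the backend named in parentheses in `iptables --version` output,
# e.g. "legacy" or "nf_tables". Reads the version string on stdin.
iptables_backend() {
  sed -n 's/.*(\([^)]*\)).*/\1/p'
}

# Usage:
#   iptables --version | iptables_backend
#
# Related checks (module names are illustrative assumptions):
#   lsmod | grep -E '^(ip_tables|iptable_nat|nf_tables)'   # loaded netfilter modules
#   grep -rs blacklist /etc/modprobe.d/                    # module block rules, if any
```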