fakoe opened this issue 1 year ago
Please provide logs from kubeadm.
This is the full output (including the stack trace) of the initialization:
I0413 10:43:42.502238 4864 interface.go:432] Looking for default routes with IPv4 addresses
I0413 10:43:42.502267 4864 interface.go:437] Default route transits interface "ens18"
I0413 10:43:42.502354 4864 interface.go:209] Interface ens18 is up
I0413 10:43:42.502393 4864 interface.go:257] Interface "ens18" has 2 addresses :[10.99.132.26/22 fe80::c838:56ff:fe4e:c487/64].
I0413 10:43:42.502406 4864 interface.go:224] Checking addr 10.99.132.26/22.
I0413 10:43:42.502411 4864 interface.go:231] IP found 10.99.132.26
I0413 10:43:42.502431 4864 interface.go:263] Found valid IPv4 address 10.99.132.26 for interface "ens18".
I0413 10:43:42.502439 4864 interface.go:443] Found active IP 10.99.132.26
I0413 10:43:42.502455 4864 kubelet.go:196] the value of KubeletConfiguration.cgroupDriver is empty; setting it to "systemd"
I0413 10:43:42.507288 4864 version.go:187] fetching Kubernetes version from URL: https://dl.k8s.io/release/stable-1.txt
[init] Using Kubernetes version: v1.27.0
[preflight] Running pre-flight checks
I0413 10:43:42.945684 4864 checks.go:563] validating Kubernetes and kubeadm version
I0413 10:43:42.945718 4864 checks.go:168] validating if the firewall is enabled and active
I0413 10:43:42.951622 4864 checks.go:203] validating availability of port 6443
I0413 10:43:42.951767 4864 checks.go:203] validating availability of port 10259
I0413 10:43:42.951793 4864 checks.go:203] validating availability of port 10257
I0413 10:43:42.951817 4864 checks.go:280] validating the existence of file /etc/kubernetes/manifests/kube-apiserver.yaml
I0413 10:43:42.966786 4864 checks.go:280] validating the existence of file /etc/kubernetes/manifests/kube-controller-manager.yaml
I0413 10:43:42.966806 4864 checks.go:280] validating the existence of file /etc/kubernetes/manifests/kube-scheduler.yaml
I0413 10:43:42.966813 4864 checks.go:280] validating the existence of file /etc/kubernetes/manifests/etcd.yaml
I0413 10:43:42.966822 4864 checks.go:430] validating if the connectivity type is via proxy or direct
I0413 10:43:42.966837 4864 checks.go:469] validating http connectivity to first IP address in the CIDR
I0413 10:43:42.966851 4864 checks.go:469] validating http connectivity to first IP address in the CIDR
I0413 10:43:42.966861 4864 checks.go:104] validating the container runtime
I0413 10:43:43.780194 4864 checks.go:639] validating whether swap is enabled or not
I0413 10:43:43.780262 4864 checks.go:370] validating the presence of executable crictl
I0413 10:43:43.780283 4864 checks.go:370] validating the presence of executable conntrack
I0413 10:43:43.780292 4864 checks.go:370] validating the presence of executable ip
I0413 10:43:43.780304 4864 checks.go:370] validating the presence of executable iptables
I0413 10:43:43.780317 4864 checks.go:370] validating the presence of executable mount
I0413 10:43:43.780329 4864 checks.go:370] validating the presence of executable nsenter
I0413 10:43:43.780340 4864 checks.go:370] validating the presence of executable ebtables
I0413 10:43:43.780386 4864 checks.go:370] validating the presence of executable ethtool
I0413 10:43:43.780412 4864 checks.go:370] validating the presence of executable socat
I0413 10:43:43.780453 4864 checks.go:370] validating the presence of executable tc
I0413 10:43:43.780492 4864 checks.go:370] validating the presence of executable touch
I0413 10:43:43.780506 4864 checks.go:516] running all checks
I0413 10:43:43.798457 4864 checks.go:401] checking whether the given node name is valid and reachable using net.LookupHost
I0413 10:43:43.798479 4864 checks.go:605] validating kubelet version
I0413 10:43:43.838385 4864 checks.go:130] validating if the "kubelet" service is enabled and active
I0413 10:43:43.882023 4864 checks.go:203] validating availability of port 10250
I0413 10:43:43.882083 4864 checks.go:329] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables
I0413 10:43:43.882131 4864 checks.go:329] validating the contents of file /proc/sys/net/ipv4/ip_forward
I0413 10:43:43.882149 4864 checks.go:203] validating availability of port 2379
I0413 10:43:43.882167 4864 checks.go:203] validating availability of port 2380
I0413 10:43:43.882183 4864 checks.go:243] validating the existence and emptiness of directory /var/lib/etcd
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
W0413 10:43:43.882420 4864 images.go:80] could not find officially supported version of etcd for Kubernetes v1.27.0, falling back to the nearest etcd version (3.5.7-0)
I0413 10:43:43.882435 4864 checks.go:828] using image pull policy: IfNotPresent
I0413 10:43:43.950959 4864 checks.go:854] pulling: registry.k8s.io/kube-apiserver:v1.27.0
I0413 10:43:51.632545 4864 checks.go:854] pulling: registry.k8s.io/kube-controller-manager:v1.27.0
I0413 10:43:58.612257 4864 checks.go:854] pulling: registry.k8s.io/kube-scheduler:v1.27.0
I0413 10:44:00.510757 4864 checks.go:854] pulling: registry.k8s.io/kube-proxy:v1.27.0
I0413 10:44:03.587274 4864 checks.go:833] failed to detect the sandbox image for local container runtime, output: time="2023-04-13T10:44:03Z" level=fatal msg="getting status of runtime: failed to template data: template: tmplExecuteRawJSON:1:9: executing \"tmplExecuteRawJSON\" at <.config.sandboxImage>: map has no entry for key \"config\""
, error: exit status 1
I0413 10:44:03.611804 4864 checks.go:854] pulling: registry.k8s.io/pause:3.9
I0413 10:44:04.728341 4864 checks.go:854] pulling: registry.k8s.io/etcd:3.5.7-0
I0413 10:44:21.136428 4864 checks.go:854] pulling: registry.k8s.io/coredns/coredns:v1.10.1
[certs] Using certificateDir folder "/etc/kubernetes/pki"
I0413 10:44:22.906828 4864 certs.go:112] creating a new certificate authority for ca
[certs] Generating "ca" certificate and key
I0413 10:44:23.120082 4864 certs.go:519] validating certificate period for ca certificate
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local kubernetes2023master] and IPs [10.96.0.1 10.99.132.26]
[certs] Generating "apiserver-kubelet-client" certificate and key
I0413 10:44:23.392790 4864 certs.go:112] creating a new certificate authority for front-proxy-ca
[certs] Generating "front-proxy-ca" certificate and key
I0413 10:44:23.541900 4864 certs.go:519] validating certificate period for front-proxy-ca certificate
[certs] Generating "front-proxy-client" certificate and key
I0413 10:44:23.599912 4864 certs.go:112] creating a new certificate authority for etcd-ca
[certs] Generating "etcd/ca" certificate and key
I0413 10:44:23.746892 4864 certs.go:519] validating certificate period for etcd/ca certificate
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kubernetes2023master localhost] and IPs [10.99.132.26 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kubernetes2023master localhost] and IPs [10.99.132.26 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
I0413 10:44:24.415989 4864 certs.go:78] creating new public/private key files for signing service account users
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
I0413 10:44:24.541200 4864 kubeconfig.go:103] creating kubeconfig file for admin.conf
[kubeconfig] Writing "admin.conf" kubeconfig file
I0413 10:44:24.876449 4864 kubeconfig.go:103] creating kubeconfig file for kubelet.conf
[kubeconfig] Writing "kubelet.conf" kubeconfig file
I0413 10:44:25.073252 4864 kubeconfig.go:103] creating kubeconfig file for controller-manager.conf
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
I0413 10:44:25.384124 4864 kubeconfig.go:103] creating kubeconfig file for scheduler.conf
[kubeconfig] Writing "scheduler.conf" kubeconfig file
I0413 10:44:25.732113 4864 kubelet.go:67] Stopping the kubelet
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
I0413 10:44:26.094621 4864 manifests.go:99] [control-plane] getting StaticPodSpecs
I0413 10:44:26.095001 4864 certs.go:519] validating certificate period for CA certificate
I0413 10:44:26.095068 4864 manifests.go:125] [control-plane] adding volume "ca-certs" for component "kube-apiserver"
I0413 10:44:26.095076 4864 manifests.go:125] [control-plane] adding volume "etc-ca-certificates" for component "kube-apiserver"
I0413 10:44:26.095081 4864 manifests.go:125] [control-plane] adding volume "etc-pki" for component "kube-apiserver"
I0413 10:44:26.095087 4864 manifests.go:125] [control-plane] adding volume "k8s-certs" for component "kube-apiserver"
I0413 10:44:26.095091 4864 manifests.go:125] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-apiserver"
I0413 10:44:26.095099 4864 manifests.go:125] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-apiserver"
I0413 10:44:26.109452 4864 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-apiserver" to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
I0413 10:44:26.109480 4864 manifests.go:99] [control-plane] getting StaticPodSpecs
I0413 10:44:26.109708 4864 manifests.go:125] [control-plane] adding volume "ca-certs" for component "kube-controller-manager"
I0413 10:44:26.109722 4864 manifests.go:125] [control-plane] adding volume "etc-ca-certificates" for component "kube-controller-manager"
I0413 10:44:26.109730 4864 manifests.go:125] [control-plane] adding volume "etc-pki" for component "kube-controller-manager"
I0413 10:44:26.109736 4864 manifests.go:125] [control-plane] adding volume "flexvolume-dir" for component "kube-controller-manager"
I0413 10:44:26.109742 4864 manifests.go:125] [control-plane] adding volume "k8s-certs" for component "kube-controller-manager"
I0413 10:44:26.109748 4864 manifests.go:125] [control-plane] adding volume "kubeconfig" for component "kube-controller-manager"
I0413 10:44:26.109753 4864 manifests.go:125] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-controller-manager"
I0413 10:44:26.109757 4864 manifests.go:125] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-controller-manager"
I0413 10:44:26.110386 4864 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-controller-manager" to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[control-plane] Creating static Pod manifest for "kube-scheduler"
I0413 10:44:26.110401 4864 manifests.go:99] [control-plane] getting StaticPodSpecs
I0413 10:44:26.110578 4864 manifests.go:125] [control-plane] adding volume "kubeconfig" for component "kube-scheduler"
I0413 10:44:26.110925 4864 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-scheduler" to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
W0413 10:44:26.111084 4864 images.go:80] could not find officially supported version of etcd for Kubernetes v1.27.0, falling back to the nearest etcd version (3.5.7-0)
I0413 10:44:26.126934 4864 local.go:65] [etcd] wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
I0413 10:44:26.126961 4864 waitcontrolplane.go:83] [wait-control-plane] Waiting for the API server to be healthy
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock logs CONTAINERID'
couldn't initialize a Kubernetes cluster
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init.runWaitControlPlanePhase
cmd/kubeadm/app/cmd/phases/init/waitcontrolplane.go:108
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
cmd/kubeadm/app/cmd/phases/workflow/runner.go:259
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
cmd/kubeadm/app/cmd/phases/workflow/runner.go:446
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
cmd/kubeadm/app/cmd/phases/workflow/runner.go:232
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
cmd/kubeadm/app/cmd/init.go:111
github.com/spf13/cobra.(*Command).execute
vendor/github.com/spf13/cobra/command.go:916
github.com/spf13/cobra.(*Command).ExecuteC
vendor/github.com/spf13/cobra/command.go:1040
github.com/spf13/cobra.(*Command).Execute
vendor/github.com/spf13/cobra/command.go:968
k8s.io/kubernetes/cmd/kubeadm/app.Run
cmd/kubeadm/app/kubeadm.go:50
main.main
cmd/kubeadm/kubeadm.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:250
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1598
error execution phase wait-control-plane
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
cmd/kubeadm/app/cmd/phases/workflow/runner.go:260
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
cmd/kubeadm/app/cmd/phases/workflow/runner.go:446
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
cmd/kubeadm/app/cmd/phases/workflow/runner.go:232
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
cmd/kubeadm/app/cmd/init.go:111
github.com/spf13/cobra.(*Command).execute
vendor/github.com/spf13/cobra/command.go:916
github.com/spf13/cobra.(*Command).ExecuteC
vendor/github.com/spf13/cobra/command.go:1040
github.com/spf13/cobra.(*Command).Execute
vendor/github.com/spf13/cobra/command.go:968
k8s.io/kubernetes/cmd/kubeadm/app.Run
cmd/kubeadm/app/kubeadm.go:50
main.main
cmd/kubeadm/kubeadm.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:250
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1598
This is an excerpt of the kubelet's journal:
Apr 13 10:51:48 kubernetes2023master kubelet[5454]: E0413 10:51:48.150851 5454 event.go:289] Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"kubernetes2023master.175578a0835c3468", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"kubernetes2023master", UID:"kubernetes2023master", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasNoDiskPressure", Message:"Node kubernetes2023master status is now: NodeHasNoDiskPressure", Source:v1.EventSource{Component:"kubelet", Host:"kubernetes2023master"}, FirstTimestamp:time.Date(2023, time.April, 13, 10, 44, 26, 686706792, time.Local), LastTimestamp:time.Date(2023, time.April, 13, 10, 44, 26, 686706792, time.Local), Count:1, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'Post "https://10.99.132.26:6443/api/v1/namespaces/default/events": dial tcp 10.99.132.26:6443: connect: connection refused'(may retry after sleeping)
Apr 13 10:51:49 kubernetes2023master kubelet[5454]: E0413 10:51:49.312244 5454 pod_workers.go:1281] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-scheduler\" with CrashLoopBackOff: \"back-off 5m0s restarting failed container=kube-scheduler pod=kube-scheduler-kubernetes2023master_kube-system(01b93443896f05f35e58e6528727fbe2)\"" pod="kube-system/kube-scheduler-kubernetes2023master" podUID=01b93443896f05f35e58e6528727fbe2
Apr 13 10:51:49 kubernetes2023master kubelet[5454]: I0413 10:51:49.326705 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="629459d15997d7fc643cb6f8e57c14dfa721fe1125c2ff5a855e9cfd61c22ccb"
Apr 13 10:51:49 kubernetes2023master kubelet[5454]: I0413 10:51:49.326860 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="db323c83722bc9712690892034fe31c77f70dbaa990daa9283a8ce8458dbba75"
Apr 13 10:51:49 kubernetes2023master kubelet[5454]: E0413 10:51:49.370542 5454 pod_workers.go:1281] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"etcd\" with CrashLoopBackOff: \"back-off 5m0s restarting failed container=etcd pod=etcd-kubernetes2023master_kube-system(9fbcb229ba4f4d1422a77bb7e8754b2f)\"" pod="kube-system/etcd-kubernetes2023master" podUID=9fbcb229ba4f4d1422a77bb7e8754b2f
Apr 13 10:51:49 kubernetes2023master kubelet[5454]: E0413 10:51:49.383414 5454 pod_workers.go:1281] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-apiserver\" with CrashLoopBackOff: \"back-off 2m40s restarting failed container=kube-apiserver pod=kube-apiserver-kubernetes2023master_kube-system(50d99ddb3cabe509efcae42e576fd419)\"" pod="kube-system/kube-apiserver-kubernetes2023master" podUID=50d99ddb3cabe509efcae42e576fd419
Apr 13 10:51:49 kubernetes2023master kubelet[5454]: E0413 10:51:49.384218 5454 pod_workers.go:1281] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-controller-manager\" with CrashLoopBackOff: \"back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-kubernetes2023master_kube-system(bf3cd71cea8fd4cc0dfa876ed8267d72)\"" pod="kube-system/kube-controller-manager-kubernetes2023master" podUID=bf3cd71cea8fd4cc0dfa876ed8267d72
Apr 13 10:51:49 kubernetes2023master kubelet[5454]: I0413 10:51:49.397667 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="3d0aff0a8096208ea52a3896aa8fece15fa1ee0d1385055d325eb9757eaadb8e"
Apr 13 10:51:49 kubernetes2023master kubelet[5454]: I0413 10:51:49.397703 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="b11ba4391ffd87089266a5e285e2a8dbe47385942f0fdcb0c630d02daef50269"
Apr 13 10:51:49 kubernetes2023master kubelet[5454]: I0413 10:51:49.413685 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="a5e702a933897caa2004a80eeb29c29fbeb634dc98e9d511d04540a2bc75c20b"
Apr 13 10:51:49 kubernetes2023master kubelet[5454]: I0413 10:51:49.413705 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="a2504213f90d01b529caff023797915d6905e0995c483412d5a89cae65b02fed"
Apr 13 10:51:49 kubernetes2023master kubelet[5454]: I0413 10:51:49.432689 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="7da46526a0eb1e721778c8d9bc5f558949c3a80e18b74caa15b25e6e3ce3efa9"
Apr 13 10:51:50 kubernetes2023master kubelet[5454]: I0413 10:51:50.464176 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="7d247eb6bf4d14b0fb3f8dee1e542db82ef01bc1bba138a1da1dc2e0442e8eeb"
Apr 13 10:51:50 kubernetes2023master kubelet[5454]: I0413 10:51:50.762097 5454 kubelet_node_status.go:70] "Attempting to register node" node="kubernetes2023master"
Apr 13 10:51:50 kubernetes2023master kubelet[5454]: E0413 10:51:50.762393 5454 kubelet_node_status.go:92] "Unable to register node with API server" err="Post \"https://10.99.132.26:6443/api/v1/nodes\": dial tcp 10.99.132.26:6443: connect: connection refused" node="kubernetes2023master"
Apr 13 10:51:50 kubernetes2023master kubelet[5454]: E0413 10:51:50.925853 5454 pod_workers.go:1281] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"etcd\" with CrashLoopBackOff: \"back-off 5m0s restarting failed container=etcd pod=etcd-kubernetes2023master_kube-system(9fbcb229ba4f4d1422a77bb7e8754b2f)\"" pod="kube-system/etcd-kubernetes2023master" podUID=9fbcb229ba4f4d1422a77bb7e8754b2f
Apr 13 10:51:51 kubernetes2023master kubelet[5454]: E0413 10:51:51.513091 5454 pod_workers.go:1281] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-apiserver\" with CrashLoopBackOff: \"back-off 2m40s restarting failed container=kube-apiserver pod=kube-apiserver-kubernetes2023master_kube-system(50d99ddb3cabe509efcae42e576fd419)\"" pod="kube-system/kube-apiserver-kubernetes2023master" podUID=50d99ddb3cabe509efcae42e576fd419
Apr 13 10:51:51 kubernetes2023master kubelet[5454]: I0413 10:51:51.530365 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="db323c83722bc9712690892034fe31c77f70dbaa990daa9283a8ce8458dbba75"
Apr 13 10:51:51 kubernetes2023master kubelet[5454]: E0413 10:51:51.649786 5454 controller.go:146] "Failed to ensure lease exists, will retry" err="Get \"https://10.99.132.26:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/kubernetes2023master?timeout=10s\": dial tcp 10.99.132.26:6443: connect: connection refused" interval="7s"
Apr 13 10:51:51 kubernetes2023master kubelet[5454]: E0413 10:51:51.752916 5454 pod_workers.go:1281] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-controller-manager\" with CrashLoopBackOff: \"back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-kubernetes2023master_kube-system(bf3cd71cea8fd4cc0dfa876ed8267d72)\"" pod="kube-system/kube-controller-manager-kubernetes2023master" podUID=bf3cd71cea8fd4cc0dfa876ed8267d72
Apr 13 10:51:51 kubernetes2023master kubelet[5454]: E0413 10:51:51.757166 5454 pod_workers.go:1281] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-scheduler\" with CrashLoopBackOff: \"back-off 5m0s restarting failed container=kube-scheduler pod=kube-scheduler-kubernetes2023master_kube-system(01b93443896f05f35e58e6528727fbe2)\"" pod="kube-system/kube-scheduler-kubernetes2023master" podUID=01b93443896f05f35e58e6528727fbe2
Apr 13 10:51:51 kubernetes2023master kubelet[5454]: I0413 10:51:51.765867 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="b11ba4391ffd87089266a5e285e2a8dbe47385942f0fdcb0c630d02daef50269"
Apr 13 10:51:51 kubernetes2023master kubelet[5454]: I0413 10:51:51.765888 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="40fdb0abb3c2a90d7e7d8b928b66dec3e9a3bd1b6c5edbf13178903c604cc543"
Apr 13 10:51:51 kubernetes2023master kubelet[5454]: I0413 10:51:51.803063 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="5c01a66c140f5d2c094860d7a45755563c73d8c6ccb06fb4df86085972252b7a"
Apr 13 10:51:52 kubernetes2023master kubelet[5454]: I0413 10:51:52.868081 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="c896a3cc7073b3ea653c8050fc4f2db3af5958cdd8ea5d81a59eb88719309c53"
Apr 13 10:51:53 kubernetes2023master kubelet[5454]: E0413 10:51:53.678918 5454 pod_workers.go:1281] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-apiserver\" with CrashLoopBackOff: \"back-off 2m40s restarting failed container=kube-apiserver pod=kube-apiserver-kubernetes2023master_kube-system(50d99ddb3cabe509efcae42e576fd419)\"" pod="kube-system/kube-apiserver-kubernetes2023master" podUID=50d99ddb3cabe509efcae42e576fd419
Apr 13 10:51:53 kubernetes2023master kubelet[5454]: E0413 10:51:53.797516 5454 pod_workers.go:1281] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"etcd\" with CrashLoopBackOff: \"back-off 5m0s restarting failed container=etcd pod=etcd-kubernetes2023master_kube-system(9fbcb229ba4f4d1422a77bb7e8754b2f)\"" pod="kube-system/etcd-kubernetes2023master" podUID=9fbcb229ba4f4d1422a77bb7e8754b2f
Apr 13 10:51:54 kubernetes2023master kubelet[5454]: W0413 10:51:54.387840 5454 reflector.go:533] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.Node: Get "https://10.99.132.26:6443/api/v1/nodes?fieldSelector=metadata.name%3Dkubernetes2023master&limit=500&resourceVersion=0": dial tcp 10.99.132.26:6443: connect: connection refused
Apr 13 10:51:54 kubernetes2023master kubelet[5454]: E0413 10:51:54.387909 5454 reflector.go:148] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Node: failed to list *v1.Node: Get "https://10.99.132.26:6443/api/v1/nodes?fieldSelector=metadata.name%3Dkubernetes2023master&limit=500&resourceVersion=0": dial tcp 10.99.132.26:6443: connect: connection refused
Apr 13 10:51:54 kubernetes2023master kubelet[5454]: E0413 10:51:54.786090 5454 pod_workers.go:1281] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-controller-manager\" with CrashLoopBackOff: \"back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-kubernetes2023master_kube-system(bf3cd71cea8fd4cc0dfa876ed8267d72)\"" pod="kube-system/kube-controller-manager-kubernetes2023master" podUID=bf3cd71cea8fd4cc0dfa876ed8267d72
Apr 13 10:51:54 kubernetes2023master kubelet[5454]: E0413 10:51:54.797220 5454 pod_workers.go:1281] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-scheduler\" with CrashLoopBackOff: \"back-off 5m0s restarting failed container=kube-scheduler pod=kube-scheduler-kubernetes2023master_kube-system(01b93443896f05f35e58e6528727fbe2)\"" pod="kube-system/kube-scheduler-kubernetes2023master" podUID=01b93443896f05f35e58e6528727fbe2
Apr 13 10:51:54 kubernetes2023master kubelet[5454]: E0413 10:51:54.806466 5454 pod_workers.go:1281] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"etcd\" with CrashLoopBackOff: \"back-off 5m0s restarting failed container=etcd pod=etcd-kubernetes2023master_kube-system(9fbcb229ba4f4d1422a77bb7e8754b2f)\"" pod="kube-system/etcd-kubernetes2023master" podUID=9fbcb229ba4f4d1422a77bb7e8754b2f
Apr 13 10:51:54 kubernetes2023master kubelet[5454]: E0413 10:51:54.810932 5454 pod_workers.go:1281] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-apiserver\" with CrashLoopBackOff: \"back-off 2m40s restarting failed container=kube-apiserver pod=kube-apiserver-kubernetes2023master_kube-system(50d99ddb3cabe509efcae42e576fd419)\"" pod="kube-system/kube-apiserver-kubernetes2023master" podUID=50d99ddb3cabe509efcae42e576fd419
Apr 13 10:51:54 kubernetes2023master kubelet[5454]: I0413 10:51:54.951435 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="6d458255ba187a8dc60b552bdbed40432116411e1e27caee72468fa76e407615"
Apr 13 10:51:54 kubernetes2023master kubelet[5454]: I0413 10:51:54.971965 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="5444aadb15b531aa24a1a2dcc09f12218cb3b709a8ccaee7b10c3d8ec5cead7d"
Apr 13 10:51:54 kubernetes2023master kubelet[5454]: I0413 10:51:54.997265 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="40fdb0abb3c2a90d7e7d8b928b66dec3e9a3bd1b6c5edbf13178903c604cc543"
Apr 13 10:51:55 kubernetes2023master kubelet[5454]: I0413 10:51:55.019924 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="51b0e82a2952d972ea6226a0bb8e0f2517b99b02ab12e391b66e1ac90c2d5967"
Apr 13 10:51:56 kubernetes2023master kubelet[5454]: E0413 10:51:56.537583 5454 pod_workers.go:1281] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-scheduler\" with CrashLoopBackOff: \"back-off 5m0s restarting failed container=kube-scheduler pod=kube-scheduler-kubernetes2023master_kube-system(01b93443896f05f35e58e6528727fbe2)\"" pod="kube-system/kube-scheduler-kubernetes2023master" podUID=01b93443896f05f35e58e6528727fbe2
Apr 13 10:51:56 kubernetes2023master kubelet[5454]: E0413 10:51:56.539847 5454 pod_workers.go:1281] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-apiserver\" with CrashLoopBackOff: \"back-off 2m40s restarting failed container=kube-apiserver pod=kube-apiserver-kubernetes2023master_kube-system(50d99ddb3cabe509efcae42e576fd419)\"" pod="kube-system/kube-apiserver-kubernetes2023master" podUID=50d99ddb3cabe509efcae42e576fd419
Apr 13 10:51:56 kubernetes2023master kubelet[5454]: I0413 10:51:56.552678 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="1ca88116a657fc08792718629e349eda0aa822ada6c29e807f7be3f5c391ad9d"
Apr 13 10:51:56 kubernetes2023master kubelet[5454]: I0413 10:51:56.552701 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="5199c650e3cd9e01b91274cecf680ca50950794982e3d4d5ee14d0cc09c1c7ae"
Apr 13 10:51:56 kubernetes2023master kubelet[5454]: E0413 10:51:56.716946 5454 pod_workers.go:1281] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"etcd\" with CrashLoopBackOff: \"back-off 5m0s restarting failed container=etcd pod=etcd-kubernetes2023master_kube-system(9fbcb229ba4f4d1422a77bb7e8754b2f)\"" pod="kube-system/etcd-kubernetes2023master" podUID=9fbcb229ba4f4d1422a77bb7e8754b2f
Apr 13 10:51:56 kubernetes2023master kubelet[5454]: E0413 10:51:56.759854 5454 pod_workers.go:1281] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-controller-manager\" with CrashLoopBackOff: \"back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-kubernetes2023master_kube-system(bf3cd71cea8fd4cc0dfa876ed8267d72)\"" pod="kube-system/kube-controller-manager-kubernetes2023master" podUID=bf3cd71cea8fd4cc0dfa876ed8267d72
Apr 13 10:51:56 kubernetes2023master kubelet[5454]: I0413 10:51:56.774392 5454 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="a45fdbbe40ec4eed666cc834335ec3541f495c4a8d657f053f04eb687d0fc269"
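Following the hint at the end of the kubeadm output above, the next thing to check is probably the logs of the crash-looping control-plane containers through the cri-dockerd socket. A short sketch of the commands kubeadm itself suggests (CONTAINERID is a placeholder for an ID taken from the ps -a output):
sudo crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock ps -a | grep kube | grep -v pause
sudo crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock logs CONTAINERID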
What user are you running this as? What are the permissions on the socket versus that user's group membership?
The /var/run/cri-dockerd.sock is in group docker. The user I run as is 1000 and is in the groups: lobster adm cdrom sudo dip plugdev lxd.
I installed cri-dockerd as root and ran every other command as user 1000.
srw-rw---- 1 root docker 0 Apr 13 11:42 /var/run/cri-dockerd.sock
drwxr-x--- 5 lobster lobster 4096 Apr 13 11:44 lobster
I0413 10:44:03.587274 4864 checks.go:833] failed to detect the sandbox image for local container runtime, output: time="2023-04-13T10:44:03Z" level=fatal msg="getting status of runtime: failed to template data: template: tmplExecuteRawJSON:1:9: executing \"tmplExecuteRawJSON\" at <.config.sandboxImage>: map has no entry for key \"config\"" , error: exit status 1
This has absolutely nothing to do with this.
The user must be in the docker group or otherwise have permissions to actually manipulate containers.
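For reference, the usual way to grant that is to add the user to the docker group; a minimal sketch, assuming the docker group already exists on the host:
sudo usermod -aG docker $USER   # add the current user to the docker group
newgrp docker                   # or log out and back in so the new group membership takes effect
ls -l /var/run/cri-dockerd.sock # the socket above is root:docker with mode srw-rw----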
I run kubeadm with sudo, so I think the permissions should be fine, shouldn't they? It was misleading of me to say I run every other command as user 1000. Well I did, but I used sudo :/ Sorry, that might not have been clear. Basically you can copy-paste all commands from my initial post above. That is exactly all that I did.
A year and a half ago I created another Kubernetes cluster, back when Docker was still part of it, using almost the same procedure: I installed Docker, then Kubernetes, and initialized with "sudo kubeadm ...".
It was basically the same steps as I described at the top of the thread, minus Enable kernel modules, Load modules, Setup required sysctl params, Reload sysctl and, obviously, Install cri-dockerd.
I also used Ubuntu 20.04 back then; could using 22.04 now be a problem?
I also tried setting up the current Kubernetes cluster with containerd, and that actually works fine. The only difference in the kubeadm output, compared to running it with cri-dockerd, is the error message cuisongliu cited regarding the sandbox image.
Everything else, apart from kubeadm failing, is identical between containerd and cri-dockerd when I initialize the cluster with kubeadm.
I wonder, has any of you created a new Kubernetes cluster from scratch since the release of 1.27? I might try running Kubernetes 1.26: if that also fails, it is probably my setup, and if it runs, it probably has something to do with 1.27 somehow.
I wonder, has anyone of you created a new Kubernetes-Cluster from scratch since the new release of 1.27?
Yes.
Where does this argument come from: --pod-network-cidr 192.168.0.0/16? Does it conflict with the Docker network?
Before your cluster is operational, you also need to configure CNI. But that should fail at a later stage, like with coredns.
systemctl status kubelet
As far as I know, it sets the IP range for the CNI. I want to use Calico, so I set the range myself. For Flannel, it is required to be set to "10.244.0.0/16". But I don't even get to installing either of them, since the init itself won't finish successfully.
I tested installing cri-dockerd with Kubernetes 1.26.x and 1.25.x. Both run into the same error as 1.27.x, so I assume it must be something other than the issue cuisongliu mentioned (the sandbox image config): 1.26 and 1.25 didn't show that error during kubeadm init, yet they still fail with the same results and the same kubelet logs (see my second post).
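For completeness, the same settings can also be passed via a kubeadm config file instead of command-line flags; a minimal sketch, assuming the v1beta3 kubeadm API and the cri-dockerd socket used in this thread (the file name is arbitrary):
cat <<'EOF' > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
networking:
  podSubnet: 192.168.0.0/16   # Calico's default pod CIDR; Flannel expects 10.244.0.0/16
EOF
sudo kubeadm init --config kubeadm-config.yaml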
Does that CIDR overlap with your normal network?
No, there are no networks that overlap.
But I just found out that if I use the pre-built binaries of cri-dockerd, kubeadm init works without problems.
That means, if you take my installation steps from the first post and replace the step Install cri-dockerd (as root) with the following steps, kubeadm initializes the cluster successfully:
VER=$(curl -s https://api.github.com/repos/Mirantis/cri-dockerd/releases/latest|grep tag_name | cut -d '"' -f 4|sed 's/v//g')
wget https://github.com/Mirantis/cri-dockerd/releases/download/v${VER}/cri-dockerd-${VER}.amd64.tgz
tar xvf cri-dockerd-${VER}.amd64.tgz
sudo mv cri-dockerd/cri-dockerd /usr/local/bin/
wget https://raw.githubusercontent.com/Mirantis/cri-dockerd/master/packaging/systemd/cri-docker.service
wget https://raw.githubusercontent.com/Mirantis/cri-dockerd/master/packaging/systemd/cri-docker.socket
sudo mv cri-docker.socket cri-docker.service /etc/systemd/system/
sudo sed -i -e 's,/usr/bin/cri-dockerd,/usr/local/bin/cri-dockerd,' /etc/systemd/system/cri-docker.service
sudo systemctl daemon-reload
sudo systemctl enable cri-docker.service
sudo systemctl enable --now cri-docker.socket
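To confirm that the service actually picks up the locally installed binary and that the socket is usable, something like the following can be run (assuming cri-dockerd prints its version with --version):
/usr/local/bin/cri-dockerd --version
systemctl status cri-docker.service cri-docker.socket
sudo crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock info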
The pre-built binaries have the version cri-dockerd 0.3.1 (7e528b98).
The binaries built following the README in this repository have the version cri-dockerd 0.3.1 (HEAD).
So my problem must be caused by code between 7e528b98 and HEAD.
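If someone wants to narrow that down, a sketch of how the suspect range could be inspected or bisected in a local clone of this repository (commit range taken from the version strings above):
git clone https://github.com/Mirantis/cri-dockerd.git && cd cri-dockerd
git log --oneline 7e528b98..HEAD   # commits that differ between the two builds
git bisect start HEAD 7e528b98     # mark HEAD bad and 7e528b98 good, then rebuild and retest at each step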
I used cri-dockerd 0.3.1 (7e528b98), from cri-dockerd_0.3.1.3-0.ubuntu-focal_amd64.deb, for the vanilla testing.
Hi,
I'm relatively new to Kubernetes and the general setup. I read some guides and the official documentation of Kubernetes / Mirantis / Docker on how to install all required components for a cluster setup. I tried to get the Kubernetes control plane set up successfully, but it doesn't work with cri-dockerd. I also tried it with the latest containerd.io and it was successful, so I figured I might ask for help in this repository, since the setup works with containerd but not with cri-dockerd. I wrote down all my (exact) installation steps below. After
kubeadm init ...
I run into a timeout and get an error message (described in Actual behaviour). Can you help me fix this problem? I can provide further logs / diagnostics of my system if you tell me where to look. Thanks in advance!
Expected behaviour The initialization runs without problems and I get to "Your Kubernetes control-plane has initialized successfully!"
Actual behaviour I get the following error message:
General information about setup
Installation steps
sudo kubeadm init --pod-network-cidr 192.168.0.0/16 --cri-socket unix:///var/run/cri-dockerd.sock
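For anyone retrying after a failed attempt like the one above, the node usually needs to be cleaned up between attempts; a sketch, assuming the same cri-dockerd socket (note that kubeadm reset does not clean up CNI configuration by itself):
sudo kubeadm reset --cri-socket unix:///var/run/cri-dockerd.sock
sudo rm -rf /etc/cni/net.d
sudo kubeadm init --pod-network-cidr 192.168.0.0/16 --cri-socket unix:///var/run/cri-dockerd.sock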