kubernetes-sigs / kind

Kubernetes IN Docker - local clusters for testing Kubernetes
https://kind.sigs.k8s.io/
Apache License 2.0

failed to create cluster: failed to init node with kubeadm, and KIND_EXPERIMENTAL_PROVIDER=podman #3581

Closed: KubeKyrie closed this issue 5 months ago

KubeKyrie commented 5 months ago

What happened:

kind export logs: logs.tar.gz

Running `kind create cluster` without any other config fails. I have checked https://kind.sigs.k8s.io/docs/user/known-issues/, but found no similar issue.

I have also tried v1.26.4 and v1.27.3 (via `--image kindest/node:v1.26.4@sha256xxx`), and both fail with the same error. v1.26.2 works, however.
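For reference, each image was pinned roughly like this (the digest is elided here just as above; `<digest>` is a placeholder for the real sha256 of the tag):

```sh
# pin a specific node image; <digest> stands in for the real sha256 (elided above)
KIND_EXPERIMENTAL_PROVIDER=podman kind create cluster \
  --image kindest/node:v1.26.4@sha256:<digest>
```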

error logs:

enabling experimental podman provider
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.27.1) 🖼
 ✓ Preparing nodes 📦
 ✓ Writing configuration 📜
 ✗ Starting control-plane 🕹️
Deleted nodes: ["kind-control-plane"]
ERROR: failed to create cluster: failed to init node with kubeadm: command "podman exec --privileged kind-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1
Command Output: I0416 09:58:19.166054     252 initconfiguration.go:255] loading configuration from "/kind/kubeadm.conf"
W0416 09:58:19.168950     252 initconfiguration.go:332] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta3, Kind=JoinConfiguration
[init] Using Kubernetes version: v1.27.1
[certs] Using certificateDir folder "/etc/kubernetes/pki"
I0416 09:58:19.198741     252 certs.go:112] creating a new certificate authority for ca
[certs] Generating "ca" certificate and key
I0416 09:58:19.624630     252 certs.go:519] validating certificate period for ca certificate
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kind-control-plane kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local localhost] and IPs [10.96.0.1 10.89.0.2 127.0.0.1]
[certs] Generating "apiserver-kubelet-client" certificate and key
I0416 09:58:20.302075     252 certs.go:112] creating a new certificate authority for front-proxy-ca
[certs] Generating "front-proxy-ca" certificate and key
I0416 09:58:20.637068     252 certs.go:519] validating certificate period for front-proxy-ca certificate
[certs] Generating "front-proxy-client" certificate and key
I0416 09:58:21.353022     252 certs.go:112] creating a new certificate authority for etcd-ca
[certs] Generating "etcd/ca" certificate and key
I0416 09:58:21.565145     252 certs.go:519] validating certificate period for etcd/ca certificate
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kind-control-plane localhost] and IPs [10.89.0.2 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kind-control-plane localhost] and IPs [10.89.0.2 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
I0416 09:58:23.513008     252 certs.go:78] creating new public/private key files for signing service account users
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
I0416 09:58:23.766844     252 kubeconfig.go:103] creating kubeconfig file for admin.conf
[kubeconfig] Writing "admin.conf" kubeconfig file
I0416 09:58:23.944150     252 kubeconfig.go:103] creating kubeconfig file for kubelet.conf
[kubeconfig] Writing "kubelet.conf" kubeconfig file
I0416 09:58:24.239178     252 kubeconfig.go:103] creating kubeconfig file for controller-manager.conf
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
I0416 09:58:24.634047     252 kubeconfig.go:103] creating kubeconfig file for scheduler.conf
[kubeconfig] Writing "scheduler.conf" kubeconfig file
I0416 09:58:24.944819     252 kubelet.go:67] Stopping the kubelet
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
I0416 09:58:25.419502     252 manifests.go:99] [control-plane] getting StaticPodSpecs
I0416 09:58:25.420944     252 certs.go:519] validating certificate period for CA certificate
I0416 09:58:25.421080     252 manifests.go:125] [control-plane] adding volume "ca-certs" for component "kube-apiserver"
I0416 09:58:25.421092     252 manifests.go:125] [control-plane] adding volume "etc-ca-certificates" for component "kube-apiserver"
I0416 09:58:25.421103     252 manifests.go:125] [control-plane] adding volume "k8s-certs" for component "kube-apiserver"
I0416 09:58:25.421111     252 manifests.go:125] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-apiserver"
I0416 09:58:25.421120     252 manifests.go:125] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
I0416 09:58:25.429012     252 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-apiserver" to "/etc/kubernetes/manifests/kube-apiserver.yaml"
I0416 09:58:25.429067     252 manifests.go:99] [control-plane] getting StaticPodSpecs
I0416 09:58:25.429862     252 manifests.go:125] [control-plane] adding volume "ca-certs" for component "kube-controller-manager"
I0416 09:58:25.429881     252 manifests.go:125] [control-plane] adding volume "etc-ca-certificates" for component "kube-controller-manager"
I0416 09:58:25.429890     252 manifests.go:125] [control-plane] adding volume "flexvolume-dir" for component "kube-controller-manager"
I0416 09:58:25.429897     252 manifests.go:125] [control-plane] adding volume "k8s-certs" for component "kube-controller-manager"
I0416 09:58:25.429905     252 manifests.go:125] [control-plane] adding volume "kubeconfig" for component "kube-controller-manager"
I0416 09:58:25.429912     252 manifests.go:125] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-controller-manager"
I0416 09:58:25.429919     252 manifests.go:125] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
I0416 09:58:25.432116     252 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-controller-manager" to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
I0416 09:58:25.432157     252 manifests.go:99] [control-plane] getting StaticPodSpecs
I0416 09:58:25.432670     252 manifests.go:125] [control-plane] adding volume "kubeconfig" for component "kube-scheduler"
I0416 09:58:25.434022     252 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-scheduler" to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
W0416 09:58:25.434418     252 images.go:80] could not find officially supported version of etcd for Kubernetes v1.27.1, falling back to the nearest etcd version (3.5.7-0)
I0416 09:58:25.436095     252 local.go:65] [etcd] wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
I0416 09:58:25.436189     252 waitcontrolplane.go:83] [wait-control-plane] Waiting for the API server to be healthy
I0416 09:58:25.437110     252 loader.go:373] Config loaded from file:  /etc/kubernetes/admin.conf
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
I0416 09:58:25.448895     252 round_trippers.go:553] GET https://kind-control-plane:6443/healthz?timeout=10s  in 4 milliseconds
I0416 09:58:25.950968     252 round_trippers.go:553] GET https://kind-control-plane:6443/healthz?timeout=10s  in 1 milliseconds
I0416 09:58:26.451138     252 round_trippers.go:553] GET https://kind-control-plane:6443/healthz?timeout=10s  in 1 milliseconds
I0416 09:58:26.951183     252 round_trippers.go:553] GET https://kind-control-plane:6443/healthz?timeout=10s  in 1 milliseconds
I0416 10:02:25.451866     252 round_trippers.go:553] GET https://kind-control-plane:6443/healthz?timeout=10s  in 1 milliseconds

Unfortunately, an error has occurred:
    timed out waiting for the condition

This error is likely caused by:
    - The kubelet is not running
    - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
    - 'systemctl status kubelet'
    - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
    - 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
    Once you have found the failing container, you can inspect its logs with:
    - 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'
couldn't initialize a Kubernetes cluster
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init.runWaitControlPlanePhase
    cmd/kubeadm/app/cmd/phases/init/waitcontrolplane.go:108
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
    cmd/kubeadm/app/cmd/phases/workflow/runner.go:259
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
    cmd/kubeadm/app/cmd/phases/workflow/runner.go:446
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
    cmd/kubeadm/app/cmd/phases/workflow/runner.go:232
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
    cmd/kubeadm/app/cmd/init.go:111
github.com/spf13/cobra.(*Command).execute
    vendor/github.com/spf13/cobra/command.go:916
github.com/spf13/cobra.(*Command).ExecuteC
    vendor/github.com/spf13/cobra/command.go:1040
github.com/spf13/cobra.(*Command).Execute
    vendor/github.com/spf13/cobra/command.go:968
k8s.io/kubernetes/cmd/kubeadm/app.Run
    cmd/kubeadm/app/kubeadm.go:50
main.main
    cmd/kubeadm/kubeadm.go:25
runtime.main
    /usr/local/go/src/runtime/proc.go:250
runtime.goexit
    /usr/local/go/src/runtime/asm_amd64.s:1598
error execution phase wait-control-plane
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
    cmd/kubeadm/app/cmd/phases/workflow/runner.go:260
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
    cmd/kubeadm/app/cmd/phases/workflow/runner.go:446
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
    cmd/kubeadm/app/cmd/phases/workflow/runner.go:232
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
    cmd/kubeadm/app/cmd/init.go:111
github.com/spf13/cobra.(*Command).execute
    vendor/github.com/spf13/cobra/command.go:916
github.com/spf13/cobra.(*Command).ExecuteC
    vendor/github.com/spf13/cobra/command.go:1040
github.com/spf13/cobra.(*Command).Execute
    vendor/github.com/spf13/cobra/command.go:968
k8s.io/kubernetes/cmd/kubeadm/app.Run
    cmd/kubeadm/app/kubeadm.go:50
main.main
    cmd/kubeadm/kubeadm.go:25
runtime.main
    /usr/local/go/src/runtime/proc.go:250
runtime.goexit
    /usr/local/go/src/runtime/asm_amd64.s:1598
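With the podman provider, the kubeadm troubleshooting hints above translate to roughly the following, assuming the node container is kept alive with `--retain` (otherwise kind deletes `kind-control-plane` on failure, as shown above):

```sh
# keep the failed node instead of letting kind delete it
KIND_EXPERIMENTAL_PROVIDER=podman kind create cluster --retain

# inspect the kubelet inside the node container
podman exec kind-control-plane systemctl status kubelet
podman exec kind-control-plane journalctl -xeu kubelet

# list control-plane containers with crictl, per the kubeadm output above
podman exec kind-control-plane crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a
```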

What you expected to happen: The kind cluster is created successfully.

How to reproduce it (as minimally and precisely as possible): Just run `kind create cluster`.

Anything else we need to know?:
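In case it helps, the exact repro on my machine is just the podman provider from the title plus the default create:

```sh
export KIND_EXPERIMENTAL_PROVIDER=podman
kind create cluster
```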

Environment:

- OS (from /etc/os-release): CENTOS_MANTISBT_PROJECT="CentOS-7" CENTOS_MANTISBT_PROJECT_VERSION="7" REDHAT_SUPPORT_PRODUCT="centos" REDHAT_SUPPORT_PRODUCT_VERSION="7"
- Kubernetes version (use `kubectl version`): v1.27.1
- Any proxies or other special environment settings?: None
stmcginnis commented 5 months ago

Can you fill in more of the "Environment" details from the end of the issue template? That can contain some useful information about your environment to help understand what is going on.

You are also using an older version of kind, so the first thing to try would be upgrading to the latest release. Then make sure you are using one of the supported Kubernetes release versions for that release.

Once you upgrade, you can run `kind create cluster --retain` if creation fails. Then `kind export logs` will get you all of the log output from the creation process, where you can track down what is failing: https://kind.sigs.k8s.io/docs/user/known-issues/#troubleshooting-kind
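That flow looks something like this (the output directory name is arbitrary):

```sh
kind create cluster --retain   # keep the nodes around when creation fails
kind export logs ./kind-logs   # collect logs from the failed creation
kind delete cluster            # clean up once done debugging
```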

/remove-kind bug
/kind support

KubeKyrie commented 5 months ago

Hi @stmcginnis, thanks for the advice. I have added more information about this error to the issue; could you help figure out what the problem is?

Next I will upgrade kind and try again.

aojea commented 5 months ago

You need to use the latest stable version, and check that the images you are using match that version as per the release notes.
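For example (the tag and digest below are placeholders; take the real pair for your kind release from its release notes):

```sh
kind version   # confirm which kind release is installed
# then use a node image published for that release
kind create cluster --image kindest/node:v1.29.2@sha256:<digest>
```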

You are also using cgroups v1, which has several open issues; you should rule that out as the cause as well.
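A quick way to check which cgroup version the host is on (`cgroup2fs` means cgroups v2, `tmpfs` means v1):

```sh
stat -fc %T /sys/fs/cgroup/
```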

BenTheElder commented 5 months ago

RHEL 7 is covered in https://github.com/kubernetes-sigs/kind/issues/3311#issuecomment-2060835007

I highly recommend using a more recent distro to develop Kubernetes.