kubernetes-sigs / kind

Kubernetes IN Docker - local clusters for testing Kubernetes
https://kind.sigs.k8s.io/
Apache License 2.0

The second execution of the `kind create cluster` command failed #3489

Open AllenZMC opened 9 months ago

AllenZMC commented 9 months ago

The first time I used kind to create a k8s cluster, it succeeded. I then ran `kind delete cluster`, but running `kind create cluster` again failed.

Environment:

kind version: 0.20.0

VM Info:
NAME="Red Hat Enterprise Linux"
VERSION="8.4 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.4"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.4 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8.4:GA"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/red_hat_enterprise_linux/8/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.4
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.4"

Docker Version:
Client: Docker Engine - Community
 Version:           24.0.7
 API version:       1.43
 Go version:        go1.20.10
 Git commit:        afdd53b
 Built:             Thu Oct 26 09:09:18 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          24.0.7
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.10
  Git commit:       311b9ff
  Built:            Thu Oct 26 09:08:20 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.26
  GitCommit:        3dd1e886e55dd695541fdcd67420c2888645a495
 runc:
  Version:          1.1.10
  GitCommit:        v1.1.10-0-g18a0cb0
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

What happened:

When I run the following command for the first time, the cluster is created successfully: `kind create cluster --image=kindest/node:v1.27.3`

Then I run the delete command, `kind delete cluster --name=kind`. Both of these succeed.

But when I run `kind create cluster --image=kindest/node:v1.27.3` again to recreate the cluster, I get an error. The error log is as follows:

ERROR: failed to create cluster: failed to init node with kubeadm: command "docker exec --privileged xxxxx2222-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1
Command Output: I0124 08:15:04.954755     205 initconfiguration.go:255] loading configuration from "/kind/kubeadm.conf"
W0124 08:15:04.955321     205 initconfiguration.go:332] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta3, Kind=JoinConfiguration
[init] Using Kubernetes version: v1.27.3
[certs] Using certificateDir folder "/etc/kubernetes/pki"
I0124 08:15:04.960980     205 certs.go:112] creating a new certificate authority for ca
[certs] Generating "ca" certificate and key
I0124 08:15:05.054638     205 certs.go:519] validating certificate period for ca certificate
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local localhost xxxxx2222-control-plane] and IPs [10.93.0.1 171.19.0.2 127.0.0.1]

...

I0124 08:16:22.031284     205 round_trippers.go:553] GET https://xxxxx2222-control-plane:6443/healthz?timeout=10s  in 0 milliseconds
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
I0124 08:16:22.530787     205 round_trippers.go:553] GET https://xxxxx2222-control-plane:6443/healthz?timeout=10s  in 0 milliseconds

Restarting Docker has no effect, but after restarting the virtual machine I can create the cluster successfully again.

Did kind change any linux configuration?
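
For anyone trying to reproduce this, the sequence above can be scripted as in the sketch below. It is only an illustration, not part of the original report: `LOG_DIR` and the `DRY_RUN` switch are hypothetical conveniences, and the default cluster name `kind` is assumed. The `--retain` flag keeps the node container around when creation fails, so `kind export logs` can collect the kubelet logs afterwards.

```shell
# Sketch: replay the create / delete / create sequence from the report.
# Set DRY_RUN=1 to print the commands instead of running them.

IMAGE="${IMAGE:-kindest/node:v1.27.3}"   # image used in the report
LOG_DIR="${LOG_DIR:-./kind-logs}"        # hypothetical output directory

# run CMD...: execute the command, or only print it when DRY_RUN=1
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

repro() {
  run kind create cluster --image="$IMAGE" || return 1
  run kind delete cluster --name=kind || return 1
  # --retain keeps the node container if this second create fails
  if ! run kind create cluster --image="$IMAGE" --retain; then
    run kind export logs "$LOG_DIR"
    return 1
  fi
}

# Usage: DRY_RUN=1 repro   (preview)   or   repro   (run for real)
```

If the second create fails, the exported directory should contain the kubelet journal from the control-plane node, which usually says more than the `kubelet-check` lines in the kubeadm output.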

BenTheElder commented 9 months ago

> Did kind change any linux configuration?

no.

Please check: https://kind.sigs.k8s.io/docs/user/known-issues/

A common one: https://kind.sigs.k8s.io/docs/user/known-issues/#pod-errors-due-to-too-many-open-files

Typically we see resource exhaustion issues.

Are you running anything else between these commands, such as other containers?
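
A quick way to check for the "too many open files" case is to compare the host's inotify limits against a minimum, as in the sketch below. The thresholds (524288 watches, 512 instances) are assumptions believed to match what the linked known-issues page suggests, so verify them against the page. The function names are made up for this example.

```shell
# check_limit NAME CURRENT MIN
# Prints a LOW line and returns 1 when CURRENT is below MIN.
check_limit() {
  if [ "$2" -lt "$3" ]; then
    echo "LOW $1=$2 (suggested >= $3)"
    return 1
  fi
  echo "OK $1=$2"
}

# check_inotify: read the live values from /proc and compare them
# against the assumed minimums; returns 1 if either is too low.
check_inotify() {
  rc=0
  check_limit fs.inotify.max_user_watches \
    "$(cat /proc/sys/fs/inotify/max_user_watches)" 524288 || rc=1
  check_limit fs.inotify.max_user_instances \
    "$(cat /proc/sys/fs/inotify/max_user_instances)" 512 || rc=1
  return "$rc"
}

# Usage: check_inotify
```

A `LOW` line does not prove this is the cause here, but it is cheap to rule out before looking further.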

kundan2707 commented 9 months ago

@AllenZMC Were you able to find out whether the cause was among the listed known issues?

AllenZMC commented 9 months ago

> @AllenZMC Were you able to find out whether the cause was among the listed known issues?

no

rapphil commented 5 months ago

Facing the same issue.

I noticed that this happens only starting with kind v0.20.0. On kind v0.19.0, creating the cluster with the same image succeeds.

BenTheElder commented 5 months ago

sounds like https://kind.sigs.k8s.io/docs/user/known-issues/#older-linux-distributions
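
To compare a host against that section, it helps to collect the kernel version and cgroup setup in one place. The sketch below is a rough illustration: which of these facts matter is an assumption to check against the linked page, and the function names are hypothetical. It relies on `stat -fc %T /sys/fs/cgroup` reporting `cgroup2fs` on a cgroup v2 host and `tmpfs` on cgroup v1.

```shell
# Map the filesystem type of /sys/fs/cgroup to a cgroup version.
cgroup_version_from_fstype() {
  case "$1" in
    cgroup2fs) echo v2 ;;
    tmpfs)     echo v1 ;;
    *)         echo unknown ;;
  esac
}

# Print host facts to compare against the known-issues section.
host_report() {
  echo "kernel: $(uname -r)"
  echo "host cgroup: $(cgroup_version_from_fstype "$(stat -fc %T /sys/fs/cgroup 2>/dev/null)")"
  # CgroupVersion is a field of `docker info`; empty if docker is unreachable
  echo "docker cgroup version: $(docker info --format '{{.CgroupVersion}}' 2>/dev/null)"
}

# Usage: host_report
```

Running `host_report` on a host that fails with v0.20.0 and on one that works would make the difference easier to spot.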