deckhouse / deckhouse

Kubernetes platform from Flant
https://deckhouse.io

Deckhouse does not work locally in kind #4746

Closed Zhbert closed 1 year ago

Zhbert commented 1 year ago

Version

stable

Expected Behavior

Deckhouse should start in a minimal configuration on a node in kind.

Actual Behavior

Deckhouse crashes with several errors. The behavior is identical on macOS and Linux.

Steps To Reproduce

  1. Run the installation with this script.
  2. Most likely, cluster creation will fail at the node creation stage:
    
    ✓ Ensuring node image (kindest/node:v1.23.13) 🖼
    ✗ Preparing nodes 📦
    ERROR: failed to create cluster: command "docker run --name habr-test-control-plane --hostname habr-test-control-plane --label io.x-k8s.kind.role=control-plane --privileged --security-opt seccomp=unconfined --security-opt apparmor=unconfined --tmpfs /tmp --tmpfs /run --volume /var --volume /lib/modules:/lib/modules:ro -e KIND_EXPERIMENTAL_CONTAINERD_SNAPSHOTTER --detach --tty --label io.x-k8s.kind.cluster=habr-test --net kind --restart=on-failure:1 --init=false --publish=127.0.0.1:80:80/TCP --publish=127.0.0.1:443:443/TCP --publish=127.0.0.1:57977:6443/TCP -e KUBECONFIG=/etc/kubernetes/admin.conf kindest/node:v1.23.13@sha256:e7968cda1b4ff790d5b0b5b0c29bda0404cdb825fd939fe50fd5accc43e3a730" failed with error: exit status 125
    Command Output: WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
    50262dc280641b227d550949c41bd555c0adbc1190bb47a4c7156e391f0ba39f
    docker: Error response from daemon: Ports are not available: exposing port TCP 127.0.0.1:443 -> 0.0.0.0:0: not allowed as current user.
    You can enable privileged port mapping from Docker -> Settings... -> Advanced -> Enable privileged port mapping.

Error creating cluster. If error is like '...port is already allocated' or '... address already in use', then you need to free ports 80 and 443. E.g., you can find programs that use these ports using the following command:

sudo lsof -n -i TCP@0.0.0.0:80,443 -s TCP:LISTEN
  * Part of the error is resolved by enabling privileged port mapping in Docker Desktop on macOS (`Docker -> Settings... -> Advanced -> Enable privileged port mapping`).
  * On Linux, nothing additional needs to be enabled.
  3. Running kind directly then shows the default node image version that kind uses:

    ~/.kind-d8/kind create cluster
    Creating cluster "kind" ...
    ✓ Ensuring node image (kindest/node:v1.25.3) 🖼
    ✓ Preparing nodes 📦
    ✓ Writing configuration 📜
    ✓ Starting control-plane 🕹️
    ✓ Installing CNI 🔌
    ✓ Installing StorageClass 💾
    Set kubectl context to "kind-kind"
    You can now use your cluster with:

    kubectl cluster-info --context kind-kind

    Thanks for using kind! 😊

  4. Replace the node image version in the script with this one; cluster creation then succeeds.
  5. The Deckhouse installation starts, but it fails when the Ingress controller is deployed:

    │ ┌ Create deckhouse.io/v1, Kind=IngressNginxController resources
    │ │ Manifest for IngressNginxController nginx
    │ │ 🎉 Succeeded!
    │ └ Create deckhouse.io/v1, Kind=IngressNginxController resources (0.02 seconds)
    └ ⛵ ~ Bootstrap: Create resources (0.18 seconds)
    -n Waiting for the Ingress controller to be ready...
    -n .
    -n .
    -n .
    -n .
    ...
    Timeout waiting for the creation of Ingress controller!

Here is the output of the 'kubectl -n d8-ingress-nginx get ds controller-nginx -o wide' command: Error from server (NotFound): daemonsets.apps "controller-nginx" not found

Here is the output of the 'kubectl -n d8-ingress-nginx get po -l app=controller' command: No resources found in d8-ingress-nginx namespace.

If the controller-nginx Pod is in the ContainerCreating status, you most likely have a slow connection. If so, wait a little longer until the controller-nginx Pod becomes Ready. After that, run the following command to get the admin password for Grafana: 'kubectl --context kind-habr -n d8-system exec deploy/deckhouse -- sh -c "deckhouse-controller module values prometheus -o json | jq -r '.prometheus.internal.auth.password'"
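The script's check uses the label selector `app=controller`, which matches nothing here, so it cannot show a Pod that is failing elsewhere in the namespace. As a minimal sketch (the function name `non_running_pods` is made up for illustration and is not part of the install script), the plain `kubectl get po` table can be filtered for Pods that are neither `Running` nor `Completed`. The sample row is the one from this report.

```shell
# List Pods whose STATUS column is neither "Running" nor "Completed".
# Reads the tabular output of `kubectl get po` on stdin and prints
# "<pod-name> <status>" for each match. The header row is skipped.
non_running_pods() {
  awk '$1 != "NAME" && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Demo on a captured table (no cluster needed):
non_running_pods <<'EOF'
NAME                                         READY   STATUS             RESTARTS        AGE
kruise-controller-manager-5bb6f77678-9hwms   0/1     CrashLoopBackOff   9 (2m11s ago)   29m
EOF
# prints: kruise-controller-manager-5bb6f77678-9hwms CrashLoopBackOff
```

Against a live cluster the same filter would be used as `kubectl -n d8-ingress-nginx get po | non_running_pods`.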

A few logs:

    kubectl -n d8-ingress-nginx get ds controller-nginx -o wide
    Error from server (NotFound): daemonsets.apps "controller-nginx" not found

    kubectl -n d8-ingress-nginx get po
    NAME                                         READY   STATUS             RESTARTS        AGE
    kruise-controller-manager-5bb6f77678-9hwms   0/1     CrashLoopBackOff   9 (2m11s ago)   29m

    kubectl -n d8-ingress-nginx logs -l app=controller -c controller
    No resources found in d8-ingress-nginx namespace.

    I0523 14:39:24.026673 1 feature_gate.go:245] feature gates: &{map[PodWebhook:false ResourcesDeletionProtection:true]}
    I0523 14:39:24.030222 1 feature_gate.go:245] feature gates: &{map[KruisePodReadinessGate:false PodWebhook:false ResourcesDeletionProtection:true]}
    I0523 14:39:24.030430 1 feature_gate.go:245] feature gates: &{map[KruisePodReadinessGate:false PodWebhook:false ResourcesDeletionProtection:false]}
    I0523 14:39:24.030523 1 feature_gate.go:245] feature gates: &{map[KruisePodReadinessGate:false PodUnavailableBudgetDeleteGate:false PodWebhook:false ResourcesDeletionProtection:false]}
    I0523 14:39:24.030625 1 feature_gate.go:245] feature gates: &{map[KruisePodReadinessGate:false PodUnavailableBudgetDeleteGate:false PodUnavailableBudgetUpdateGate:false PodWebhook:false ResourcesDeletionProtection:false]}
    I0523 14:39:24.030743 1 feature_gate.go:245] feature gates: &{map[KruisePodReadinessGate:false PodUnavailableBudgetDeleteGate:false PodUnavailableBudgetUpdateGate:false PodWebhook:false ResourcesDeletionProtection:false WorkloadSpread:false]}
    I0523 14:39:24.030850 1 feature_gate.go:245] feature gates: &{map[KruisePodReadinessGate:false PodUnavailableBudgetDeleteGate:false PodUnavailableBudgetUpdateGate:false PodWebhook:false ResourcesDeletionProtection:false SidecarSetPatchPodMetadataDefaultsAllowed:false WorkloadSpread:false]}
    I0523 14:39:24.051774 1 deleg.go:130] setup "msg"="new clientset registry"
    I0523 14:39:24.166206 1 deleg.go:130] controller-runtime/metrics "msg"="Metrics server is starting to listen" "addr"=":8080"
    I0523 14:39:24.169834 1 deleg.go:130] setup "msg"="register field index"
    W0523 14:39:24.172519 1 mutation_detector.go:53] Mutation detector is enabled, this will result in memory leakage.
    W0523 14:39:24.173720 1 mutation_detector.go:53] Mutation detector is enabled, this will result in memory leakage.
    W0523 14:39:24.174097 1 mutation_detector.go:53] Mutation detector is enabled, this will result in memory leakage.
    I0523 14:39:24.174252 1 deleg.go:130] setup "msg"="setup webhook"
    I0523 14:39:24.177732 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-apps-deployment"
    I0523 14:39:24.178561 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-apps-kruise-io-v1alpha1-cloneset"
    I0523 14:39:24.178949 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-apps-kruise-io-v1alpha1-daemonset"
    I0523 14:39:24.179285 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-apps-kruise-io-v1alpha1-broadcastjob"
    I0523 14:39:24.179631 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-apps-replicaset"
    I0523 14:39:24.179917 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-apps-kruise-io-v1alpha1-imagepulljob"
    I0523 14:39:24.180237 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-apps-kruise-io-podprobemarker"
    I0523 14:39:24.180565 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-apps-kruise-io-statefulset"
    I0523 14:39:24.180918 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/mutate-apps-kruise-io-v1alpha1-advancedcronjob"
    I0523 14:39:24.181249 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-apps-kruise-io-v1alpha1-nodeimage"
    I0523 14:39:24.181565 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-apps-kruise-io-persistentpodstate"
    I0523 14:39:24.181928 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/mutate-apps-kruise-io-v1alpha1-containerrecreaterequest"
    I0523 14:39:24.182260 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/mutate-apps-kruise-io-v1alpha1-sidecarset"
    I0523 14:39:24.182609 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/mutate-apps-kruise-io-statefulset"
    I0523 14:39:24.182939 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-apps-statefulset"
    I0523 14:39:24.183223 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/mutate-apps-kruise-io-v1alpha1-daemonset"
    I0523 14:39:24.183527 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/mutate-apps-kruise-io-v1alpha1-imagepulljob"
    I0523 14:39:24.183852 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-apps-kruise-io-v1alpha1-advancedcronjob"
    I0523 14:39:24.184160 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-customresourcedefinition"
    I0523 14:39:24.184510 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/mutate-apps-kruise-io-v1alpha1-nodeimage"
    I0523 14:39:24.184860 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-policy-kruise-io-podunavailablebudget"
    I0523 14:39:24.185221 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-apps-kruise-io-v1alpha1-uniteddeployment"
    I0523 14:39:24.185542 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-apps-kruise-io-v1alpha1-workloadspread"
    I0523 14:39:24.185878 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/mutate-apps-kruise-io-v1alpha1-broadcastjob"
    I0523 14:39:24.186183 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/mutate-apps-kruise-io-v1alpha1-cloneset"
    I0523 14:39:24.186500 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-namespace"
    I0523 14:39:24.186886 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-apps-kruise-io-v1alpha1-resourcedistribution"
    I0523 14:39:24.187276 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-apps-kruise-io-v1alpha1-sidecarset"
    I0523 14:39:24.187648 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/mutate-apps-kruise-io-v1alpha1-uniteddeployment"
    I0523 14:39:24.188000 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/convert"
    I0523 14:39:24.188377 1 server.go:145] controller-runtime/webhook "msg"="Registering webhook" "path"="/healthz"
    I0523 14:39:24.188471 1 deleg.go:130] setup "msg"="initialize webhook"
    W0523 14:39:24.193750 1 mutation_detector.go:53] Mutation detector is enabled, this will result in memory leakage.
    W0523 14:39:24.195729 1 mutation_detector.go:53] Mutation detector is enabled, this will result in memory leakage.
    W0523 14:39:24.196051 1 mutation_detector.go:53] Mutation detector is enabled, this will result in memory leakage.
    W0523 14:39:24.196775 1 mutation_detector.go:53] Mutation detector is enabled, this will result in memory leakage.
    I0523 14:39:24.197192 1 webhook_controller.go:181] Starting webhook-controller
    I0523 14:39:24.197507 1 shared_informer.go:240] Waiting for caches to sync for webhook-controller
    I0523 14:39:24.245949 1 webhook_controller.go:117] MutatingWebhookConfiguration kruise-mutating-webhook-configuration added
    I0523 14:39:24.253017 1 webhook_controller.go:134] ValidatingWebhookConfiguration kruise-validating-webhook-configuration added
    I0523 14:39:24.253336 1 webhook_controller.go:100] Secret kruise-webhook-certs added
    I0523 14:39:24.628901 1 webhook_controller.go:153] CustomResourceDefinition daemonsets.apps.kruise.io added
    I0523 14:39:24.701975 1 shared_informer.go:247] Caches are synced for webhook-controller
    I0523 14:39:24.702186 1 webhook_controller.go:196] Started webhook-controller
    I0523 14:39:24.705622 1 webhook_controller.go:222] Starting to sync webhook certs and configurations
    I0523 14:39:24.729273 1 fs.go:126] cert directory doesn't exist, creatingdirectory/tmp/kruise-webhook-certs
    I0523 14:39:24.732464 1 webhook_controller.go:224] Finished to sync webhook certs and configurations
    E0523 14:39:24.732799 1 webhook_controller.go:215] sync "" failed with failed to write certs to dir: can't create dir: /tmp/kruise-webhook-certs

...

    E0523 14:39:24.744524 1 webhook_controller.go:215] sync "" failed with failed to write certs to dir: can't create dir: /tmp/kruise-webhook-certs
    I0523 14:39:24.755660 1 webhook_controller.go:222] Starting to sync webhook certs and configurations
    I0523 14:39:24.759448 1 fs.go:126] cert directory doesn't exist, creatingdirectory/tmp/kruise-webhook-certs
    I0523 14:39:24.759788 1 webhook_controller.go:224] Finished to sync webhook certs and configurations
    E0523 14:39:24.759892 1 webhook_controller.go:215] sync "" failed with failed to write certs to dir: can't create dir: /tmp/kruise-webhook-certs
    I0523 14:39:24.780862 1 webhook_controller.go:222] Starting to sync webhook
    I0523 14:39:25.075018 1 webhook_controller.go:222] Starting to sync webhook certs and configurationcreatingdirectory/tmp/kruise-webhook-certs
    I0523 14:39:25.404949 1 webhook_controller.go:224] Finished to sync webhook certs and configurations
    E0523 14:39:25.404985 1 webhook_controller.go:215] sync "" failed with failed to write certs to dir: can't create dir: /tmp/kruise-webhook-certs
    I0523 14:39:26.045955 1 webhook_controller.go:222] Starting to sync webhook certs and configurations
    I0523 14:39:26.050256 1 fs.go:126] cert directory doesn't exist, creatingdirectory/tmp/kruise-webhook-certs
    I0523 14:39:26.051895 1 webhook_controller.go:224] Finished to sync webhook certs and configurations
    E0523 14:39:26.051938 1 webhook_controller.go:215] sync "" failed with failed to write certs to dir: can't create dir: /tmp/kruise-webhook-certs
    I0523 14:39:27.333285 1 webhook_controller.go:222] Starting to sync webhook certs and configurations
    I0523 14:39:27.338730 1 fs.go:126] cert directory doesn't exist, creatingdirectory/tmp/kruise-webhook-certs
    I0523 14:39:27.340523 1 webhook_controller.go:224] Finished to sync webhook certs and configurations
    E0523 14:39:27.340586 1 webhook_controller.go:215] sync "" failed with failed to write certs to dir: can't create dir: /tmp/kruise-webhook-certs
    I0523 14:39:29.902092 1 webhook_controller.go:222] Starting to sync webhook certs and configurations
    I0523 14:39:29.912307 1 fs.go:126] cert directory doesn't exist, creatingdirectory/tmp/kruise-webhook-certs
    I0523 14:39:29.914174 1 webhook_controller.go:224] Finished to sync webhook certs and configurations
    E0523 14:39:29.914300 1 webhook_controller.go:215] sync "" failed with failed to write certs to dir: can't create dir: /tmp/kruise-webhook-certs
    I0523 14:39:35.034961 1 webhook_controller.go:222] Starting to sync webhook certs and configurations
    I0523 14:39:35.039956 1 fs.go:126] cert directory doesn't exist, creatingdirectory/tmp/kruise-webhook-certs
    I0523 14:39:35.040120 1 webhook_controller.go:224] Finished to sync webhook certs and configurations
    E0523 14:39:35.040155 1 webhook_controller.go:215] sync "" failed with failed to write certs to dir: can't create dir: /tmp/kruise-webhook-certs
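Most of this log is informational. The repeated `E` line (`can't create dir: /tmp/kruise-webhook-certs`) is the signal that matters. As a rough sketch, assuming the log uses the klog line layout seen above (severity letter, date, time, a numeric ID, `file:line]`, then the message), the distinct error messages can be counted like this. The function name `summarize_klog_errors` is invented for illustration.

```shell
# Count distinct error messages in klog-style output read from stdin.
# Expected line layout: "E0523 14:39:24.732799 1 file.go:215] message",
# where the first letter is the severity (I, W, E or F).
summarize_klog_errors() {
  grep -E '^E[0-9]{4} ' \
    | sed -E 's/^E[0-9]{4} [0-9:.]+ +[0-9]+ [^]]+\] //' \
    | sort | uniq -c | sort -rn
}

# Possible use against the crashing Pod from this report:
#   kubectl -n d8-ingress-nginx logs kruise-controller-manager-5bb6f77678-9hwms \
#     | summarize_klog_errors
```

Stripping the timestamp and source location before `uniq -c` collapses the retries into a single counted line.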



Additional Information

When installing kind, the script fails to create a directory for it in the home directory. It asks for elevated privileges but still fails. The workaround is to create the directory manually beforehand: `mkdir ~/.kind-d8`.
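The workaround can be made repeatable with a small check that creates the directory and confirms it is writable before the install script runs. This is only a sketch: `ensure_writable_dir` and the `KIND_DIR` override are hypothetical names, and the install script is not known to read `KIND_DIR`.

```shell
# Create a directory if it is missing and confirm the current user can write to it.
# Returns non-zero if either step fails.
ensure_writable_dir() {
  dir="$1"
  mkdir -p "$dir" && [ -w "$dir" ]
}

# KIND_DIR is a hypothetical override; the default is the path from this report.
kind_dir="${KIND_DIR:-$HOME/.kind-d8}"
if ensure_writable_dir "$kind_dir"; then
  echo "ok: $kind_dir exists and is writable"
else
  echo "cannot create or write to $kind_dir" >&2
fi
```

`mkdir -p` is used instead of plain `mkdir` so that rerunning the check does not fail when the directory already exists.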

Logs

No response

z9r5 commented 1 year ago

Ref: https://github.com/deckhouse/deckhouse/issues/3482

z9r5 commented 1 year ago

Fixed in #5576