sguyennet / terraform-vsphere-kubespray

Deploy a Kubernetes HA cluster on VMware vSphere
https://blog.inkubate.io/install-and-manage-automatically-a-kubernetes-cluster-on-vmware-vsphere-with-terraform-and-kubespray/
Apache License 2.0

FAILED - RETRYING: kubeadm | Initialize first master #34

Open juancyepes opened 3 years ago

juancyepes commented 3 years ago

Hi. I am trying to deploy the cluster using

terraform apply

However, I get the following error:

null_resource.kubespray_create (local-exec): FAILED - RETRYING: kubeadm | Initialize first master (1 retries left).
null_resource.kubespray_create: Still creating... (24m50s elapsed)
null_resource.kubespray_create: Still creating... (25m0s elapsed)
null_resource.kubespray_create: Still creating... (25m10s elapsed)
null_resource.kubespray_create: Still creating... (25m20s elapsed)
null_resource.kubespray_create: Still creating... (25m30s elapsed)
null_resource.kubespray_create: Still creating... (25m40s elapsed)
null_resource.kubespray_create: Still creating... (25m50s elapsed)
null_resource.kubespray_create: Still creating... (26m0s elapsed)
null_resource.kubespray_create: Still creating... (26m10s elapsed)
null_resource.kubespray_create: Still creating... (26m20s elapsed)
null_resource.kubespray_create: Still creating... (26m30s elapsed)
null_resource.kubespray_create: Still creating... (26m40s elapsed)
null_resource.kubespray_create: Still creating... (26m50s elapsed)
null_resource.kubespray_create: Still creating... (27m0s elapsed)
null_resource.kubespray_create: Still creating... (27m10s elapsed)
null_resource.kubespray_create: Still creating... (27m20s elapsed)
null_resource.kubespray_create: Still creating... (27m30s elapsed)
null_resource.kubespray_create: Still creating... (27m40s elapsed)
null_resource.kubespray_create: Still creating... (27m50s elapsed)
null_resource.kubespray_create: Still creating... (28m0s elapsed)
null_resource.kubespray_create: Still creating... (28m10s elapsed)
null_resource.kubespray_create: Still creating... (28m20s elapsed)
null_resource.kubespray_create: Still creating... (28m30s elapsed)
null_resource.kubespray_create: Still creating... (28m40s elapsed)
null_resource.kubespray_create: Still creating... (28m50s elapsed)
null_resource.kubespray_create: Still creating... (29m0s elapsed)
null_resource.kubespray_create: Still creating... (29m10s elapsed)
null_resource.kubespray_create: Still creating... (29m20s elapsed)
null_resource.kubespray_create: Still creating... (29m30s elapsed)
null_resource.kubespray_create: Still creating... (29m40s elapsed)
null_resource.kubespray_create: Still creating... (29m50s elapsed)

null_resource.kubespray_create (local-exec): TASK [kubernetes/master : kubeadm | Initialize first master] *** null_resource.kubespray_create (local-exec): fatal: [k8s-kubespray-master-0]: FAILED! => {"attempts": 3, "changed": true, "cmd": ["timeout", "-k", "300s", "300s", "/usr/local/bin/kubeadm", "init", "--config=/etc/kubernetes/kubeadm-config.yaml", "--ignore-preflight-errors=all", "--skip-phases=addon/coredns", "--upload-certs"], "delta": "0:05:00.010346", "end": "2021-01-06 09:43:30.880889", "failed_when_result": true, "msg": "non-zero return code", "rc": 124, "start": "2021-01-06 09:38:30.870543", "stderr": "W0106 09:38:30.922188 20824 utils.go:69] The recommended value for \"clusterDNS\" in \"KubeletConfiguration\" is: [10.233.0.10]; the provided value is: [169.254.25.10]\nW0106 09:38:31.084211 20824 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]\n\t[WARNING Port-6443]: Port 6443 is in use\n\t[WARNING Port-10259]: Port 10259 is in use\n\t[WARNING Port-10257]: Port 10257 is in use\n\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists\n\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists\n\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists\n\t[WARNING Port-10250]: Port 10250 is in use", "stderr_lines": ["W0106 09:38:30.922188 20824 utils.go:69] The recommended value for \"clusterDNS\" in \"KubeletConfiguration\" is: [10.233.0.10]; the provided value is: [169.254.25.10]", "W0106 09:38:31.084211 20824 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]", "\t[WARNING Port-6443]: Port 6443 is in use", "\t[WARNING Port-10259]: Port 10259 is in 
use", "\t[WARNING Port-10257]: Port 10257 is in use", "\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists", "\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists", "\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists", "\t[WARNING Port-10250]: Port 10250 is in use"], "stdout": "[init] Using Kubernetes version: v1.19.2\n[preflight] Running pre-flight checks\n[preflight] Pulling images required for setting up a Kubernetes cluster\n[preflight] This might take a minute or two, depending on the speed of your internet connection\n[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'\n[certs] Using certificateDir folder \"/etc/kubernetes/ssl\"\n[certs] Using existing ca certificate authority\n[certs] Using existing apiserver certificate and key on disk\n[certs] Using existing apiserver-kubelet-client certificate and key on disk\n[certs] Using existing front-proxy-ca certificate authority\n[certs] Using existing front-proxy-client certificate and key on disk\n[certs] External etcd mode: Skipping etcd/ca certificate authority generation\n[certs] External etcd mode: Skipping etcd/server certificate generation\n[certs] External etcd mode: Skipping etcd/peer certificate generation\n[certs] External etcd mode: Skipping etcd/healthcheck-client certificate generation\n[certs] External etcd mode: Skipping apiserver-etcd-client certificate generation\n[certs] Using the existing \"sa\" key\n[kubeconfig] Using kubeconfig folder \"/etc/kubernetes\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/admin.conf\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/kubelet.conf\"\n[kubeconfig] Using existing kubeconfig file: 
\"/etc/kubernetes/controller-manager.conf\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/scheduler.conf\"\n[kubelet-start] Writing kubelet environment file with flags to file \"/var/lib/kubelet/kubeadm-flags.env\"\n[kubelet-start] Writing kubelet configuration to file \"/var/lib/kubelet/config.yaml\"\n[kubelet-start] Starting the kubelet\n[control-plane] Using manifest folder \"/etc/kubernetes/manifests\"\n[control-plane] Creating static Pod manifest for \"kube-apiserver\"\n[control-plane] Creating static Pod manifest for \"kube-controller-manager\"\n[control-plane] Creating static Pod manifest for \"kube-scheduler\"\n[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory \"/etc/kubernetes/manifests\". This can take up to 5m0s\n[kubelet-check] Initial timeout of 40s passed.", "stdout_lines": ["[init] Using Kubernetes version: v1.19.2", "[preflight] Running pre-flight checks", "[preflight] Pulling images required for setting up a Kubernetes cluster", "[preflight] This might take a minute or two, depending on the speed of your internet connection", "[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'", "[certs] Using certificateDir folder \"/etc/kubernetes/ssl\"", "[certs] Using existing ca certificate authority", "[certs] Using existing apiserver certificate and key on disk", "[certs] Using existing apiserver-kubelet-client certificate and key on disk", "[certs] Using existing front-proxy-ca certificate authority", "[certs] Using existing front-proxy-client certificate and key on disk", "[certs] External etcd mode: Skipping etcd/ca certificate authority generation", "[certs] External etcd mode: Skipping etcd/server certificate generation", "[certs] External etcd mode: Skipping etcd/peer certificate generation", "[certs] External etcd mode: Skipping etcd/healthcheck-client certificate generation", "[certs] External etcd mode: Skipping apiserver-etcd-client 
certificate generation", "[certs] Using the existing \"sa\" key", "[kubeconfig] Using kubeconfig folder \"/etc/kubernetes\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/admin.conf\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/kubelet.conf\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/controller-manager.conf\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/scheduler.conf\"", "[kubelet-start] Writing kubelet environment file with flags to file \"/var/lib/kubelet/kubeadm-flags.env\"", "[kubelet-start] Writing kubelet configuration to file \"/var/lib/kubelet/config.yaml\"", "[kubelet-start] Starting the kubelet", "[control-plane] Using manifest folder \"/etc/kubernetes/manifests\"", "[control-plane] Creating static Pod manifest for \"kube-apiserver\"", "[control-plane] Creating static Pod manifest for \"kube-controller-manager\"", "[control-plane] Creating static Pod manifest for \"kube-scheduler\"", "[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory \"/etc/kubernetes/manifests\". This can take up to 5m0s", "[kubelet-check] Initial timeout of 40s passed."]}

null_resource.kubespray_create (local-exec): NO MORE HOSTS LEFT *****

null_resource.kubespray_create (local-exec): PLAY RECAP *****
null_resource.kubespray_create (local-exec): k8s-kubespray-haproxy-0 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
null_resource.kubespray_create (local-exec): k8s-kubespray-haproxy-1 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
null_resource.kubespray_create (local-exec): k8s-kubespray-master-0 : ok=492 changed=94 unreachable=0 failed=1 skipped=570 rescued=0 ignored=0
null_resource.kubespray_create (local-exec): k8s-kubespray-master-1 : ok=452 changed=91 unreachable=0 failed=0 skipped=485 rescued=0 ignored=0
null_resource.kubespray_create (local-exec): k8s-kubespray-master-2 : ok=452 changed=91 unreachable=0 failed=0 skipped=485 rescued=0 ignored=0
null_resource.kubespray_create (local-exec): k8s-kubespray-worker-0 : ok=309 changed=67 unreachable=0 failed=0 skipped=348 rescued=0 ignored=0
null_resource.kubespray_create (local-exec): k8s-kubespray-worker-1 : ok=309 changed=67 unreachable=0 failed=0 skipped=348 rescued=0 ignored=0
null_resource.kubespray_create (local-exec): k8s-kubespray-worker-2 : ok=309 changed=67 unreachable=0 failed=0 skipped=348 rescued=0 ignored=0
null_resource.kubespray_create (local-exec): localhost : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

null_resource.kubespray_create (local-exec): Wednesday 06 January 2021 09:43:30 -0500 (0:20:16.003) 0:29:49.118 *****
null_resource.kubespray_create (local-exec): ===============================================================================
null_resource.kubespray_create (local-exec): kubernetes/master : kubeadm | Initialize first master ---------------- 1216.00s
null_resource.kubespray_create (local-exec): container-engine/docker : ensure docker packages are installed --------- 59.22s
null_resource.kubespray_create (local-exec): kubernetes/preinstall : Install packages requirements ------------------ 28.88s
null_resource.kubespray_create (local-exec): download_container | Download image if required ------------------------ 17.76s
null_resource.kubespray_create (local-exec): kubernetes/preinstall : Update package management cache (APT) ---------- 17.28s
null_resource.kubespray_create (local-exec): Gen_certs | Write etcd master certs ------------------------------------ 14.68s
null_resource.kubespray_create (local-exec): download_container | Download image if required ------------------------ 14.13s
null_resource.kubespray_create (local-exec): Gen_certs | Write etcd master certs ------------------------------------ 14.09s
null_resource.kubespray_create (local-exec): container-engine/docker : ensure docker-ce repository is enabled ------- 12.56s
null_resource.kubespray_create (local-exec): download_container | Download image if required ------------------------ 11.37s
null_resource.kubespray_create (local-exec): reload etcd ------------------------------------------------------------ 10.74s
null_resource.kubespray_create (local-exec): download_container | Download image if required ------------------------ 10.71s
null_resource.kubespray_create (local-exec): download_container | Download image if required ------------------------- 9.80s
null_resource.kubespray_create (local-exec): download_container | Download image if required ------------------------- 9.65s
null_resource.kubespray_create (local-exec): download_container | Download image if required ------------------------- 9.48s
null_resource.kubespray_create (local-exec): download_file | Download item ------------------------------------------- 8.60s
null_resource.kubespray_create (local-exec): download_file | Download item ------------------------------------------- 8.19s
null_resource.kubespray_create (local-exec): download_container | Download image if required ------------------------- 8.15s
null_resource.kubespray_create (local-exec): download_container | Download image if required ------------------------- 7.85s
null_resource.kubespray_create (local-exec): download_file | Download item ------------------------------------------- 7.14s

Error: Error applying plan:

1 error(s) occurred:

NO MORE HOSTS LEFT *****

PLAY RECAP *****
k8s-kubespray-haproxy-0 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
k8s-kubespray-haproxy-1 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
k8s-kubespray-master-0 : ok=492 changed=94 unreachable=0 failed=1 skipped=570 rescued=0 ignored=0
k8s-kubespray-master-1 : ok=452 changed=91 unreachable=0 failed=0 skipped=485 rescued=0 ignored=0
k8s-kubespray-master-2 : ok=452 changed=91 unreachable=0 failed=0 skipped=485 rescued=0 ignored=0
k8s-kubespray-worker-0 : ok=309 changed=67 unreachable=0 failed=0 skipped=348 rescued=0 ignored=0
k8s-kubespray-worker-1 : ok=309 changed=67 unreachable=0 failed=0 skipped=348 rescued=0 ignored=0
k8s-kubespray-worker-2 : ok=309 changed=67 unreachable=0 failed=0 skipped=348 rescued=0 ignored=0
localhost : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

Wednesday 06 January 2021 09:43:30 -0500 (0:20:16.003) 0:29:49.118 *****

kubernetes/master : kubeadm | Initialize first master ---------------- 1216.00s
container-engine/docker : ensure docker packages are installed --------- 59.22s
kubernetes/preinstall : Install packages requirements ------------------ 28.88s
download_container | Download image if required ------------------------ 17.76s
kubernetes/preinstall : Update package management cache (APT) ---------- 17.28s
Gen_certs | Write etcd master certs ------------------------------------ 14.68s
download_container | Download image if required ------------------------ 14.13s
Gen_certs | Write etcd master certs ------------------------------------ 14.09s
container-engine/docker : ensure docker-ce repository is enabled ------- 12.56s
download_container | Download image if required ------------------------ 11.37s
reload etcd ------------------------------------------------------------ 10.74s
download_container | Download image if required ------------------------ 10.71s
download_container | Download image if required ------------------------- 9.80s
download_container | Download image if required ------------------------- 9.65s
download_container | Download image if required ------------------------- 9.48s
download_file | Download item ------------------------------------------- 8.60s
download_file | Download item ------------------------------------------- 8.19s
download_container | Download image if required ------------------------- 8.15s
download_container | Download image if required ------------------------- 7.85s
download_file | Download item ------------------------------------------- 7.14s

Terraform does not automatically rollback in the face of errors. Instead, your Terraform state file has been partially updated with any resources that successfully completed. Please address the error above and apply again to incrementally change your infrastructure.

thomasdbaker commented 2 years ago

Make sure that the NIC named in the Keepalived config matches the actual interface on your haproxy nodes, and that you can ping the VIP they hold.
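The two checks suggested above can be roughed out in a small script. This is only a sketch under assumptions: the interface name `ens192` and the VIP `192.0.2.10` are hypothetical placeholders, not values taken from this repository, and a TCP connect to port 6443 (the API server port reported in the kubeadm log) stands in for an ICMP ping, since a real ping needs raw sockets. Run the interface check on a haproxy node and the reachability check from a master node.

```python
import socket


def interface_exists(name: str) -> bool:
    """Return True if a network interface with this name exists on the local host."""
    return name in {if_name for _, if_name in socket.if_nameindex()}


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


if __name__ == "__main__":
    # Hypothetical values: replace with the interface named in your
    # Keepalived config and the VIP assigned to the haproxy pair.
    keepalived_interface = "ens192"
    vip = "192.0.2.10"
    api_port = 6443  # port shown in the kubeadm warnings in the log

    print("interface present:", interface_exists(keepalived_interface))
    print("VIP answers on API port:", tcp_reachable(vip, api_port))
```

If the interface check fails on a haproxy node, Keepalived is probably bound to a NIC name that does not exist there, which would be consistent with the VIP never coming up.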