sguyennet / terraform-vsphere-kubespray

Deploy a Kubernetes HA cluster on VMware vSphere
https://blog.inkubate.io/install-and-manage-automatically-a-kubernetes-cluster-on-vmware-vsphere-with-terraform-and-kubespray/
Apache License 2.0

Issue in deployment #11

Open saif-cl0ud opened 5 years ago

saif-cl0ud commented 5 years ago

Please find the log below:

TASK [kubernetes/master : kubeadm | Initialize first master] ****
Friday 19 April 2019 23:41:47 +0100 (0:00:03.591) 0:08:13.711 **

TASK [kubernetes/master : kubeadm | Upgrade first master] ***
Friday 19 April 2019 23:41:47 +0100 (0:00:00.229) 0:08:13.940 **

TASK [kubernetes/master : kubeadm | Enable kube-proxy] **
Friday 19 April 2019 23:41:47 +0100 (0:00:00.229) 0:08:14.170 **
FAILED - RETRYING: kubeadm | Enable kube-proxy (10 retries left).
FAILED - RETRYING: kubeadm | Enable kube-proxy (9 retries left).
FAILED - RETRYING: kubeadm | Enable kube-proxy (8 retries left).
FAILED - RETRYING: kubeadm | Enable kube-proxy (7 retries left).
FAILED - RETRYING: kubeadm | Enable kube-proxy (6 retries left).
FAILED - RETRYING: kubeadm | Enable kube-proxy (5 retries left).
FAILED - RETRYING: kubeadm | Enable kube-proxy (4 retries left).
FAILED - RETRYING: kubeadm | Enable kube-proxy (3 retries left).
FAILED - RETRYING: kubeadm | Enable kube-proxy (2 retries left).
FAILED - RETRYING: kubeadm | Enable kube-proxy (1 retries left).
fatal: [k8s-kubespray-master-0]: FAILED! => {"attempts": 10, "changed": false, "cmd": ["/usr/local/bin/kubeadm", "alpha", "phase", "addon", "kube-proxy", "--config=/etc/kubernetes/kubeadm-config.v1alpha3.yaml"], "delta": "0:00:00.053745", "end": "2019-04-19 23:42:40.261421", "msg": "non-zero return code", "rc": 1, "start": "2019-04-19 23:42:40.207676", "stderr": "error when creating kube-proxy service account: unable to create serviceaccount: Post https://172.16.10.249:6443/api/v1/namespaces/kube-system/serviceaccounts: EOF", "stderr_lines": ["error when creating kube-proxy service account: unable to create serviceaccount: Post https://172.16.10.249:6443/api/v1/namespaces/kube-system/serviceaccounts: EOF"], "stdout": "", "stdout_lines": []}

NO MORE HOSTS LEFT **
        to retry, use: --limit @/home/administrator/terraform-vsphere-kubespray/ansible/kubespray/cluster.retry

PLAY RECAP **
k8s-kubespray-master-0     : ok=275  changed=11  unreachable=0  failed=1
k8s-kubespray-master-1     : ok=252  changed=9   unreachable=0  failed=0
k8s-kubespray-master-2     : ok=252  changed=9   unreachable=0  failed=0
k8s-kubespray-worker-0     : ok=204  changed=4   unreachable=0  failed=0
localhost                  : ok=1    changed=0   unreachable=0  failed=0

Friday 19 April 2019 23:42:40 +0100 (0:00:52.784) 0:09:06.955 **

kubernetes/master : kubeadm | Enable kube-proxy ----------------------------------------------------------------- 52.78s
kubernetes/preinstall : Update package management cache (APT) --------------------------------------------------- 10.32s
download : Download items ---------------------------------------------------------------------------------------- 7.00s
gather facts from all instances ---------------------------------------------------------------------------------- 6.49s
download : Sync container ---------------------------------------------------------------------------------------- 5.85s
download : Download items ---------------------------------------------------------------------------------------- 4.81s
container-engine/docker : ensure docker packages are installed --------------------------------------------------- 4.75s
download : Sync container ---------------------------------------------------------------------------------------- 4.65s
download : Download items ---------------------------------------------------------------------------------------- 4.34s
download : Sync container ---------------------------------------------------------------------------------------- 4.34s
download : Sync container ---------------------------------------------------------------------------------------- 4.31s
download : Download items ---------------------------------------------------------------------------------------- 4.21s
kubernetes/node : Enable bridge-nf-call tables ------------------------------------------------------------------- 4.19s
download : Download items ---------------------------------------------------------------------------------------- 4.19s
download : Sync container ---------------------------------------------------------------------------------------- 4.16s
container-engine/docker : Ensure old versions of Docker are not installed. | Debian ------------------------------ 4.11s
kubernetes/preinstall : Hosts | populate inventory into hosts file ----------------------------------------------- 3.67s
kubernetes/master : kubeadm | Create kubeadm config -------------------------------------------------------------- 3.59s
download : Sync container ---------------------------------------------------------------------------------------- 3.48s
kubernetes/preinstall : Install packages requirements ------------------------------------------------------------ 3.42s

administrator@ubuntu:~/terraform-vsphere-kubespray/ansible/kubespray$ curl -k https://172.16.10.249:6443
curl: (35) gnutls_handshake() failed: The TLS connection was non-properly terminated.

saif-cl0ud commented 5 years ago

It throws the error while enabling kube-proxy.

saif-cl0ud commented 5 years ago

Output of journalctl -u kubelet:

Apr 20 01:10:09 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:09.535127 16642 kubelet.go:2236] node "k8s-kubespray-master-0" not found
Apr 20 01:10:09 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:09.538117 16642 kubelet.go:2236] node "k8s-kubespray-master-0" not found
Apr 20 01:10:09 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:09.635283 16642 kubelet.go:2236] node "k8s-kubespray-master-0" not found
Apr 20 01:10:09 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:09.638271 16642 kubelet.go:2236] node "k8s-kubespray-master-0" not found
Apr 20 01:10:09 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:09.735444 16642 kubelet.go:2236] node "k8s-kubespray-master-0" not found
Apr 20 01:10:09 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:09.738420 16642 kubelet.go:2236] node "k8s-kubespray-master-0" not found
Apr 20 01:10:09 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:09.835633 16642 kubelet.go:2236] node "k8s-kubespray-master-0" not found
Apr 20 01:10:09 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:09.838578 16642 kubelet.go:2236] node "k8s-kubespray-master-0" not found
Apr 20 01:10:09 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:09.935822 16642 kubelet.go:2236] node "k8s-kubespray-master-0" not found
Apr 20 01:10:09 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:09.938743 16642 kubelet.go:2236] node "k8s-kubespray-master-0" not found
Apr 20 01:10:09 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:09.968919 16642 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list v1.Pod: Get https://172.16.10.249:6443/api
Apr 20 01:10:09 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:09.970043 16642 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list v1.Node: Get https://172.16.10.249:6443/api/v1/nod
Apr 20 01:10:09 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:09.971244 16642 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:442: Failed to list *v1.Service: Get https://172.16.10.249:6443/api/v1/
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: W0420 01:10:10.018582 16642 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:10.018741 16642 kubelet.go:2167] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plug
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:10.035988 16642 kubelet.go:2236] node "k8s-kubespray-master-0" not found
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:10.038966 16642 kubelet.go:2236] node "k8s-kubespray-master-0" not found
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:10.136621 16642 kubelet.go:2236] node "k8s-kubespray-master-0" not found
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:10.139108 16642 kubelet.go:2236] node "k8s-kubespray-master-0" not found
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:10.237262 16642 kubelet.go:2236] node "k8s-kubespray-master-0" not found
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:10.239729 16642 kubelet.go:2236] node "k8s-kubespray-master-0" not found
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: I0420 01:10:10.265091 16642 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:10.275070 16642 datacenter.go:78] Unable to find VM by UUID. VM UUID: 420d20fb-b252-2320-a1a3-62355ab32fe0
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:10.275373 16642 runtime.go:69] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: /workspace/anago-v1.12.5-beta.0.98+51dd616cdd25d6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: /workspace/anago-v1.12.5-beta.0.98+51dd616cdd25d6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: /workspace/anago-v1.12.5-beta.0.98+51dd616cdd25d6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: /usr/local/go/src/runtime/asm_amd64.s:573
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: /usr/local/go/src/runtime/panic.go:502
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: /usr/local/go/src/runtime/panic.go:63
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: /usr/local/go/src/runtime/signal_unix.go:388
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: /workspace/anago-v1.12.5-beta.0.98+51dd616cdd25d6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/cloudprovider/providers/vsphere/vclib/da
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: /workspace/anago-v1.12.5-beta.0.98+51dd616cdd25d6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/cloudprovider/providers/vsphere/vsphere.
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: /workspace/anago-v1.12.5-beta.0.98+51dd616cdd25d6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet_node_status.go:330
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: /workspace/anago-v1.12.5-beta.0.98+51dd616cdd25d6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet_node_status.go:64
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: /workspace/anago-v1.12.5-beta.0.98+51dd616cdd25d6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet_node_status.go:362
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: /workspace/anago-v1.12.5-beta.0.98+51dd616cdd25d6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet.go:1405
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: /workspace/anago-v1.12.5-beta.0.98+51dd616cdd25d6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wai
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: /workspace/anago-v1.12.5-beta.0.98+51dd616cdd25d6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wai
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: /workspace/anago-v1.12.5-beta.0.98+51dd616cdd25d6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wai
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: /usr/local/go/src/runtime/asm_amd64.s:2361
Apr 20 01:10:10 k8s-kubespray-master-0 kubelet[16642]: E0420 01:10:10.337860 16642 kubelet.go:2236] node "k8s-kubespray-master-0" not found

sguyennet commented 5 years ago

Hi, is the kube-apiserver running properly? Could you check this with "docker ps"? Is there an error in the kube-apiserver logs? Best regards, Simon.
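
For reference, a minimal check could look like the commands below (a sketch; the grep pattern and the <container-id> placeholder are assumptions, use whatever ID "docker ps" actually reports on your master):

$ sudo docker ps | grep kube-apiserver        # is an apiserver container running at all?
$ sudo docker logs --tail 50 <container-id>   # recent apiserver log lines, if it is running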

saif-cl0ud commented 5 years ago

Only etcd is running

CONTAINER ID        IMAGE                         COMMAND                  CREATED             STATUS              PORTS               NAMES
af397f569683        quay.io/coreos/etcd:v3.2.24   "/usr/local/bin/etcd"    10 hours ago        Up 10 hours                             etcd1

sguyennet commented 5 years ago

Was the kube-apiserver running at one point? (docker ps -a) If yes, what is in the logs? (journalctl -u kube-apiserver)
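
A sketch of what that check could look like (the <container-id> placeholder is an assumption; note that in a kubeadm-based Kubespray deployment the apiserver typically runs as a static pod managed by the kubelet rather than as its own systemd unit, so "journalctl -u kubelet" may also show why it never started):

$ sudo docker ps -a | grep kube-apiserver       # also lists exited apiserver containers
$ sudo docker logs --tail 100 <container-id>    # last log lines of an exited container
$ sudo journalctl -u kubelet --since "1 hour ago" | grep -i apiserver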

saif-cl0ud commented 5 years ago

Nope, it's not running at all, and the logs show no entries.

saif-cl0ud commented 5 years ago

Are there any other steps to find the root cause of the issue?

sguyennet commented 5 years ago

Which Linux distribution are you using? Is your etcd cluster running properly?

On your k8s-kubespray-master-0:
$ sudo -s
# export ETCDCTL_API=3
# export ETCDCTL_CACERT=/etc/ssl/etcd/ssl/ca.pem
# export ETCDCTL_CERT=/etc/ssl/etcd/ssl/member-k8s-kubespray-master-0.pem
# export ETCDCTL_KEY=/etc/ssl/etcd/ssl/member-k8s-kubespray-master-0-key.pem
# /usr/local/bin/etcdctl member list
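
If the member list looks fine, the same environment variables can also be reused for a health check. A sketch, with <etcdN-ip> as placeholders for your actual etcd node addresses:

# /usr/local/bin/etcdctl --endpoints=https://<etcd1-ip>:2379,https://<etcd2-ip>:2379,https://<etcd3-ip>:2379 endpoint health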

saif-cl0ud commented 5 years ago

I think it's running properly. Below is the output:

/usr/local/bin/etcdctl member list

346d63ce2a2b4f8e, started, etcd2, https://172.16.10.244:2380, https://172.16.10.244:2379
7224b8794410fe0c, started, etcd1, https://172.16.10.243:2380, https://172.16.10.243:2379
97a231dc05f1705f, started, etcd3, https://172.16.10.245:2380, https://172.16.10.245:2379

f-cramer commented 5 years ago

Hey,

I am having the exact same issue.

Unfortunately, I had to make some changes to vsphere-kubespray.tf because the vSphere instance I am working on is not licensed to create DRS clusters. Might these changes be the cause of the problem?

diff --git a/vsphere-kubespray.tf b/vsphere-kubespray.tf
index cbcaadd..c1ab2be 100644
--- a/vsphere-kubespray.tf
+++ b/vsphere-kubespray.tf
@@ -19,7 +19,7 @@ data "vsphere_datacenter" "dc" {
   name = "${var.vsphere_datacenter}"
 }

-data "vsphere_compute_cluster" "cluster" {
+data "vsphere_host" "cluster" {
   name          = "${var.vsphere_drs_cluster}"
   datacenter_id = "${data.vsphere_datacenter.dc.id}"
 }
@@ -367,7 +367,7 @@ resource "vsphere_folder" "folder" {
 # Create a resource pool for the Kubernetes VMs #
 resource "vsphere_resource_pool" "resource_pool" {
   name                    = "${var.vsphere_resource_pool}"
-  parent_resource_pool_id = "${data.vsphere_compute_cluster.cluster.resource_pool_id}"
+  parent_resource_pool_id = "${data.vsphere_host.cluster.resource_pool_id}"
 }

 # Create the Kubernetes master VMs #
@@ -420,16 +420,6 @@ resource "vsphere_virtual_machine" "master" {
   depends_on = ["vsphere_virtual_machine.haproxy"]
 }

-# Create anti affinity rule for the Kubernetes master VMs #
-resource "vsphere_compute_cluster_vm_anti_affinity_rule" "master_anti_affinity_rule" {
-  count               = "${var.vsphere_enable_anti_affinity == "true" ? 1 : 0}"
-  name                = "${var.vm_name_prefix}-master-anti-affinity-rule"
-  compute_cluster_id  = "${data.vsphere_compute_cluster.cluster.id}"
-  virtual_machine_ids = ["${vsphere_virtual_machine.master.*.id}"]
-
-  depends_on = ["vsphere_virtual_machine.master"]
-}
-
 # Create the Kubernetes worker VMs #
 resource "vsphere_virtual_machine" "worker" {
   count            = "${length(var.vm_worker_ips)}"