solidnerd / terraform-k8s-hcloud

A simple project to spin up your k8s cluster with terraform and kubeadm on hcloud

Error running command - could not find a ready tiller pod #30

Open · yoplait opened this issue 2 years ago

yoplait commented 2 years ago

Hi team, it looks like there are some issues with the scripts:

null_resource.kube-cni: Provisioning with 'local-exec'...
null_resource.kube-cni (local-exec): Executing: ["/bin/sh" "-c" "KUBECONFIG=secrets/admin.conf helm install -n kube-system hcloud-csi-driver mlohr/hcloud-csi-driver --set csiDriver.secret.create=true --set csiDriver.secret.hcloudApiToken=0lQ5BEtHPUxodken3TCqF6pR7ZA112DSFmf5K71mEqM9YVUOSIiOj8Kt68LNM2bV"]
null_resource.kube-cni (local-exec): Error: could not find a ready tiller pod
╷
│ Error: local-exec provisioner error
│
│   with null_resource.kube-cni,
│   on 03-kube-post-init.tf line 59, in resource "null_resource" "kube-cni":
│   59:   provisioner "local-exec" {
│
│ Error running command 'KUBECONFIG=secrets/admin.conf helm install -n kube-system hcloud-csi-driver mlohr/hcloud-csi-driver --set csiDriver.secret.create=true --set
│ csiDriver.secret.hcloudApiToken=0lQ5BEtHPUxodken3TCqF6pR7ZA112DSFmf5K71mEqM9YVUOSIiOj8Kt68LNM2bV': exit status 1. Output: Error: could not find a ready tiller pod
│
╵
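
The "could not find a ready tiller pod" error comes from Helm 2, which refuses to install anything until a tiller pod reports Ready. A quick way to see what tiller is actually doing (just a sketch; tiller-deploy and the app=helm,name=tiller labels are the defaults created by helm init and may differ in this setup):

# Is the tiller deployment ever becoming available?
KUBECONFIG=secrets/admin.conf kubectl -n kube-system get deployment tiller-deploy
# Where are the tiller pods stuck, and why?
KUBECONFIG=secrets/admin.conf kubectl -n kube-system describe pods -l app=helm,name=tiller

In my case the tiller pods stay Pending (see below), so the helm install can never succeed.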

I am trying to deploy a cluster with 3 masters and 2 workers, and it looks like something is going wrong with the CNI and CoreDNS pods:

| => KUBECONFIG=secrets/admin.conf kubectl get nodes
NAME                    STATUS     ROLES    AGE     VERSION
k8s-helsinki-master-1   NotReady   master   10m     v1.18.6
k8s-helsinki-master-2   NotReady   master   8m52s   v1.18.6
k8s-helsinki-master-3   NotReady   master   7m41s   v1.18.6
k8s-helsinki-node-1     NotReady   <none>   5m53s   v1.18.6
k8s-helsinki-node-2     NotReady   <none>   6m13s   v1.18.6
________________________________________________________________________________
| ~/Documents/Code/ubloquity/terraform-k8s-hetzner-DigitalOcean-Federation/hetzner_01 @ jperez-mbp (jperez)
| => KUBECONFIG=secrets/admin.conf kubectl get pods -A -o wide
NAMESPACE     NAME                                            READY   STATUS    RESTARTS   AGE     IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   coredns-66bff467f8-9rj9r                        0/1     Pending   0          10m     <none>          <none>                  <none>           <none>
kube-system   coredns-66bff467f8-qqzvp                        0/1     Pending   0          10m     <none>          <none>                  <none>           <none>
kube-system   etcd-k8s-helsinki-master-1                      1/1     Running   0          10m     65.21.0.135     k8s-helsinki-master-1   <none>           <none>
kube-system   etcd-k8s-helsinki-master-2                      1/1     Running   0          8m48s   65.21.251.220   k8s-helsinki-master-2   <none>           <none>
kube-system   etcd-k8s-helsinki-master-3                      1/1     Running   0          7m37s   65.21.4.190     k8s-helsinki-master-3   <none>           <none>
kube-system   kube-apiserver-k8s-helsinki-master-1            1/1     Running   0          10m     65.21.0.135     k8s-helsinki-master-1   <none>           <none>
kube-system   kube-apiserver-k8s-helsinki-master-2            1/1     Running   0          8m51s   65.21.251.220   k8s-helsinki-master-2   <none>           <none>
kube-system   kube-apiserver-k8s-helsinki-master-3            1/1     Running   0          7m40s   65.21.4.190     k8s-helsinki-master-3   <none>           <none>
kube-system   kube-controller-manager-k8s-helsinki-master-1   1/1     Running   1          10m     65.21.0.135     k8s-helsinki-master-1   <none>           <none>
kube-system   kube-controller-manager-k8s-helsinki-master-2   1/1     Running   0          8m51s   65.21.251.220   k8s-helsinki-master-2   <none>           <none>
kube-system   kube-controller-manager-k8s-helsinki-master-3   1/1     Running   0          7m41s   65.21.4.190     k8s-helsinki-master-3   <none>           <none>
kube-system   kube-proxy-6mhh7                                1/1     Running   0          10m     65.21.0.135     k8s-helsinki-master-1   <none>           <none>
kube-system   kube-proxy-fxmhr                                1/1     Running   0          7m42s   65.21.4.190     k8s-helsinki-master-3   <none>           <none>
kube-system   kube-proxy-h4lt9                                1/1     Running   0          5m54s   65.21.251.5     k8s-helsinki-node-1     <none>           <none>
kube-system   kube-proxy-r85mj                                1/1     Running   0          8m52s   65.21.251.220   k8s-helsinki-master-2   <none>           <none>
kube-system   kube-proxy-v2fvk                                1/1     Running   0          6m14s   65.108.86.224   k8s-helsinki-node-2     <none>           <none>
kube-system   kube-scheduler-k8s-helsinki-master-1            1/1     Running   1          10m     65.21.0.135     k8s-helsinki-master-1   <none>           <none>
kube-system   kube-scheduler-k8s-helsinki-master-2            1/1     Running   0          8m52s   65.21.251.220   k8s-helsinki-master-2   <none>           <none>
kube-system   kube-scheduler-k8s-helsinki-master-3            1/1     Running   0          7m40s   65.21.4.190     k8s-helsinki-master-3   <none>           <none>
kube-system   tiller-deploy-56b574c76d-5t8bs                  0/1     Pending   0          5m43s   <none>          <none>                  <none>           <none>
kube-system   tiller-deploy-587d84cd48-jl9nl                  0/1     Pending   0          5m47s   <none>          <none>                  <none>           <none>

Any ideas?

| => KUBECONFIG=secrets/admin.conf kubectl get pods --namespace=kube-system -o wide

NAME                                            READY   STATUS    RESTARTS   AGE     IP              NODE                    NOMINATED NODE   READINESS GATES
coredns-66bff467f8-9rj9r                        0/1     Pending   0          11m     <none>          <none>                  <none>           <none>
coredns-66bff467f8-qqzvp                        0/1     Pending   0          11m     <none>          <none>                  <none>           <none>
etcd-k8s-helsinki-master-1                      1/1     Running   0          11m     65.21.0.135     k8s-helsinki-master-1   <none>           <none>
etcd-k8s-helsinki-master-2                      1/1     Running   0          10m     65.21.251.220   k8s-helsinki-master-2   <none>           <none>
etcd-k8s-helsinki-master-3                      1/1     Running   0          9m1s    65.21.4.190     k8s-helsinki-master-3   <none>           <none>
kube-apiserver-k8s-helsinki-master-1            1/1     Running   0          11m     65.21.0.135     k8s-helsinki-master-1   <none>           <none>
kube-apiserver-k8s-helsinki-master-2            1/1     Running   0          10m     65.21.251.220   k8s-helsinki-master-2   <none>           <none>
kube-apiserver-k8s-helsinki-master-3            1/1     Running   0          9m4s    65.21.4.190     k8s-helsinki-master-3   <none>           <none>
kube-controller-manager-k8s-helsinki-master-1   1/1     Running   1          11m     65.21.0.135     k8s-helsinki-master-1   <none>           <none>
kube-controller-manager-k8s-helsinki-master-2   1/1     Running   0          10m     65.21.251.220   k8s-helsinki-master-2   <none>           <none>
kube-controller-manager-k8s-helsinki-master-3   1/1     Running   0          9m5s    65.21.4.190     k8s-helsinki-master-3   <none>           <none>
kube-proxy-6mhh7                                1/1     Running   0          11m     65.21.0.135     k8s-helsinki-master-1   <none>           <none>
kube-proxy-fxmhr                                1/1     Running   0          9m6s    65.21.4.190     k8s-helsinki-master-3   <none>           <none>
kube-proxy-h4lt9                                1/1     Running   0          7m18s   65.21.251.5     k8s-helsinki-node-1     <none>           <none>
kube-proxy-r85mj                                1/1     Running   0          10m     65.21.251.220   k8s-helsinki-master-2   <none>           <none>
kube-proxy-v2fvk                                1/1     Running   0          7m38s   65.108.86.224   k8s-helsinki-node-2     <none>           <none>
kube-scheduler-k8s-helsinki-master-1            1/1     Running   1          11m     65.21.0.135     k8s-helsinki-master-1   <none>           <none>
kube-scheduler-k8s-helsinki-master-2            1/1     Running   0          10m     65.21.251.220   k8s-helsinki-master-2   <none>           <none>
kube-scheduler-k8s-helsinki-master-3            1/1     Running   0          9m4s    65.21.4.190     k8s-helsinki-master-3   <none>           <none>
tiller-deploy-56b574c76d-5t8bs                  0/1     Pending   0          7m7s    <none>          <none>                  <none>           <none>
tiller-deploy-587d84cd48-jl9nl                  0/1     Pending   0          7m11s   <none>          <none>                  <none>           <none>

Following some links, this looks like coredns-pod-is-not-running-in-kubernetes?

| => KUBECONFIG=secrets/admin.conf kubectl describe pods coredns-66bff467f8-9rj9r --namespace=kube-system
Name:                 coredns-66bff467f8-9rj9r
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 <none>
Labels:               k8s-app=kube-dns
                      pod-template-hash=66bff467f8
Annotations:          <none>
Status:               Pending
IP:
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  11m                default-scheduler  0/2 nodes are available: 2 node(s) had taint {node.cloudprovider.kubernetes.io/uninitialized: true}, that the pod didn't tolerate.
  Warning  FailedScheduling  10m                default-scheduler  0/3 nodes are available: 3 node(s) had taint {node.cloudprovider.kubernetes.io/uninitialized: true}, that the pod didn't tolerate.
  Warning  FailedScheduling  9m7s               default-scheduler  0/5 nodes are available: 5 node(s) had taint {node.cloudprovider.kubernetes.io/uninitialized: true}, that the pod didn't tolerate.
  Warning  FailedScheduling  9m7s               default-scheduler  0/5 nodes are available: 5 node(s) had taint {node.cloudprovider.kubernetes.io/uninitialized: true}, that the pod didn't tolerate.
  Warning  FailedScheduling  12m (x3 over 13m)  default-scheduler  0/1 nodes are available: 1 node(s) had taint {node.cloudprovider.kubernetes.io/uninitialized: true}, that the pod didn't tolerate.
  Warning  FailedScheduling  12m (x3 over 12m)  default-scheduler  0/2 nodes are available: 2 node(s) had taint {node.cloudprovider.kubernetes.io/uninitialized: true}, that the pod didn't tolerate.
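
Those node.cloudprovider.kubernetes.io/uninitialized taints are supposed to be removed by the hcloud cloud controller manager once it initializes each node, so the next thing to check is whether the CCM itself is running and what it is logging (a sketch; the deployment name and the app=hcloud-cloud-controller-manager label come from the upstream hcloud-cloud-controller-manager manifest and may differ here):

# Is the cloud controller manager pod running?
KUBECONFIG=secrets/admin.conf kubectl -n kube-system get pods -l app=hcloud-cloud-controller-manager -o wide
# What is it complaining about?
KUBECONFIG=secrets/admin.conf kubectl -n kube-system logs deployment/hcloud-cloud-controller-manager --tail=50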
yoplait commented 2 years ago

Basically this doesn't work anymore, given the state of the hcloud-cloud-controller-manager:

| => KUBECONFIG=secrets/admin.conf kubectl describe nodes | egrep "Taints:|Name:"
Name:               k8s-helsinki-master-1
Taints:             node-role.kubernetes.io/master:NoSchedule
  Warning  FailedToCreateRoute  55m                  route_controller  Could not create route 10893038-cf0b-4461-ac46-40d70e587670 10.244.0.0/24 for node k8s-helsinki-master-1 after 235.061594ms: hcloud/CreateRoute: hcops/AllServersCache.ByName: k8s-helsinki-master-1 hcops/AllServersCache.getCache: not found

All nodes are failing because of the network problem.
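
Since the route errors say hcops/AllServersCache.ByName ... not found, it is worth cross-checking that the Kubernetes node names actually match the server names in the Hetzner Cloud project the API token belongs to, because the CCM cache looks servers up by name (a sketch; assumes the hcloud CLI is configured with the same token):

# Node names as Kubernetes sees them
KUBECONFIG=secrets/admin.conf kubectl get nodes -o name
# Server names as the hcloud API sees them; these must match the node names
hcloud server list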

KUBECONFIG=secrets/admin.conf kubectl get events -A -w --sort-by='{.lastTimestamp}' 

default       3m55s       Warning   SyncLoadBalancerFailed   service/tmp-web                                         Error syncing load balancer: failed to ensure load balancer: hcloud/loadBalancers.EnsureLoadBalancer: hcops/LoadBalancerOps.ReconcileHCLBTargets: hcops/providerIDToServerID: missing prefix hcloud://:
default       66s         Warning   FailedToCreateRoute      node/k8s-helsinki-node-1                                (combined from similar events): Could not create route 2eea7cc6-31c2-4cc1-88f9-e893f841222f 10.244.3.0/24 for node k8s-helsinki-node-1 after 728.214426ms: hcloud/CreateRoute: hcops/AllServersCache.ByName: k8s-helsinki-node-1 hcops/AllServersCache.getCache: not found
default       66s         Warning   FailedToCreateRoute      node/k8s-helsinki-master-3                              (combined from similar events): Could not create route a4508fa8-3ad1-4a64-be11-c182135f663d 10.244.2.0/24 for node k8s-helsinki-master-3 after 470.881203ms: hcloud/CreateRoute: hcops/AllServersCache.ByName: k8s-helsinki-master-3 hcops/AllServersCache.getCache: not found
default       66s         Warning   FailedToCreateRoute      node/k8s-helsinki-master-2                              (combined from similar events): Could not create route 7c7e89ff-3721-470e-844f-551691779032 10.244.1.0/24 for node k8s-helsinki-master-2 after 249.027669ms: hcloud/CreateRoute: hcops/AllServersCache.ByName: k8s-helsinki-master-2 hcops/AllServersCache.getCache: not found
default       55s         Warning   FailedToCreateRoute      node/k8s-helsinki-master-1                              (combined from similar events): Could not create route 10893038-cf0b-4461-ac46-40d70e587670 10.244.0.0/24 for node k8s-helsinki-master-1 after 970.318239ms: hcloud/CreateRoute: hcops/AllServersCache.ByName: k8s-helsinki-master-1 hcops/AllServersCache.getCache: not found
default       55s         Warning   FailedToCreateRoute      node/k8s-helsinki-node-2                                (combined from similar events): Could not create route 43207003-9d37-4ba8-b7db-ccb3f6bfab61 10.244.4.0/24 for node k8s-helsinki-node-2 after 764.871302ms: hcloud/CreateRoute: hcops/AllServersCache.ByName: k8s-helsinki-node-2 hcops/AllServersCache.getCache: not found
default       42s         Warning   FailedScheduling         pod/tmp-web-5cf74bc8c8-jjrpv                            0/5 nodes are available: 2 node(s) had taint {node.kubernetes.io/network-unavailable: }, that the pod didn't tolerate, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
default       12s         Warning   FailedScheduling         pod/tmp-web-5cf74bc8c8-k8j7n                            0/5 nodes are available: 2 node(s) had taint {node.kubernetes.io/network-unavailable: }, that the pod didn't tolerate, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
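
The "missing prefix hcloud://" error above points at the nodes' spec.providerID, which the hcloud CCM normally sets to hcloud://<server-id> when it initializes a node. Checking it directly (plain kubectl, nothing project-specific):

# An empty PROVIDER-ID column means the node was never initialized by the CCM
KUBECONFIG=secrets/admin.conf kubectl get nodes -o custom-columns=NAME:.metadata.name,PROVIDER-ID:.spec.providerID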