kubermatic / kubeone

Kubermatic KubeOne automates cluster operations on all your cloud, on-prem, edge, and IoT environments.
https://kubeone.io
Apache License 2.0

Kubeone on Hetzner is not coming up... #1588

Closed exocode closed 2 years ago

exocode commented 2 years ago

I am trying to install KubeOne on Hetzner, but I cannot get it up and running. The installer keeps hanging at "Waiting for nodes to initialize by CCM...".

EDIT: I found out that I can access the cluster, but the Hetzner hcloud-cloud-controller-manager was not deployed correctly:

kube-system pod/hcloud-cloud-controller-manager-648985cfc4-hs672 0/1 CrashLoopBackOff 19 (112s ago) 74m
kube-system deployment.apps/hcloud-cloud-controller-manager 0/1 1 0 74m
kube-system service/node-local-dns ClusterIP None <none> 9253/TCP 74m

EDIT 2: When I run kubeone config print after kubeone install -m kubeone.yaml -t tf.json, I get output that is totally different from the kubeone.yaml manifest I provided:

❯ kubeone config print
apiVersion: kubeone.io/v1beta1
kind: KubeOneCluster
name: demo-cluster
versions:
  kubernetes: 1.18.2
cloudProvider:
  aws: {}
apiEndpoint:
  port: 6443

This is everything I am doing:

cd into the examples/terraform/hetzner/ folder

curl -sfL get.kubeone.io | sh

echo "HCLOUD_TOKEN: XXXXXXX" >> credentials.yml
terraform init
terraform plan
terraform output -json > tf.json

eval `ssh-agent`
ssh-add ~/.ssh/id_rsa
kubeone install -m kubeone.yaml -t tf.json -c credentials.yml --debug --verbose

(The filtered errors and warnings from this run are listed further below, marked with ***.)
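The credentials file in the steps above is built by appending with echo, which adds a second HCLOUD_TOKEN line on every re-run and leaves quoting to hand editing. A minimal sketch of a safer way to generate it, assuming the token contains no double quotes or backslashes; the helper name write_credentials is hypothetical and not part of KubeOne:

```shell
# Hypothetical helper: write the credentials file with a balanced, quoted value.
# $1: token, $2: target file
write_credentials() {
  # ">" truncates the file, so re-running does not append a duplicate key
  # the way ">>" would.
  printf 'HCLOUD_TOKEN: "%s"\n' "$1" > "$2"
  chmod 600 "$2"   # keep the secret readable by the owner only
}
```

Usage would be something like `write_credentials "$HCLOUD_TOKEN" credentials.yml`, with the token exported in the environment beforehand.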

This is my kubeone.yaml:

```yaml
apiVersion: kubeone.io/v1beta1
kind: KubeOneCluster
name: cluster-one
versions:
  kubernetes: "1.22.2"
cloudProvider:
  hetzner: {}
  external: true
```

And this is my credentials.yml:

```yaml
HCLOUD_TOKEN: MY_TOKEN_HERE
```

That is the terminal output:


INFO[19:32:36 CEST] Determine hostname...
INFO[19:32:36 CEST] Determine operating system...
INFO[19:32:36 CEST] Running host probes...
INFO[19:32:37 CEST] Installing prerequisites...
INFO[19:32:37 CEST] Creating environment file...                  node=49.12.192.225 os=ubuntu
INFO[19:32:37 CEST] Creating environment file...                  node=23.88.107.100 os=ubuntu
INFO[19:32:37 CEST] Creating environment file...                  node=23.88.108.93 os=ubuntu
INFO[19:32:37 CEST] Configuring proxy...                          node=23.88.107.100 os=ubuntu
INFO[19:32:37 CEST] Installing kubeadm...                         node=23.88.107.100 os=ubuntu
INFO[19:32:37 CEST] Configuring proxy...                          node=23.88.108.93 os=ubuntu
INFO[19:32:37 CEST] Installing kubeadm...                         node=23.88.108.93 os=ubuntu
INFO[19:32:37 CEST] Configuring proxy...                          node=49.12.192.225 os=ubuntu
INFO[19:32:37 CEST] Installing kubeadm...                         node=49.12.192.225 os=ubuntu
INFO[19:33:57 CEST] Generating kubeadm config file...
INFO[19:33:58 CEST] Uploading config files...                     node=49.12.192.225
INFO[19:33:58 CEST] Uploading config files...                     node=23.88.108.93
INFO[19:33:58 CEST] Uploading config files...                     node=23.88.107.100
INFO[19:34:00 CEST] Configuring certs and etcd on control plane node...
INFO[19:34:00 CEST] Ensuring Certificates...                      node=23.88.108.93
INFO[19:34:02 CEST] Downloading PKI...
INFO[19:34:02 CEST] Creating local backup...                      node=23.88.108.93
INFO[19:34:02 CEST] Uploading PKI...
INFO[19:34:05 CEST] Configuring certs and etcd on consecutive control plane node...
INFO[19:34:05 CEST] Ensuring Certificates...                      node=49.12.192.225
INFO[19:34:05 CEST] Ensuring Certificates...                      node=23.88.107.100
INFO[19:34:07 CEST] Initializing Kubernetes on leader...
INFO[19:34:07 CEST] Running kubeadm...                            node=23.88.108.93
INFO[19:35:25 CEST] Building Kubernetes clientset...
INFO[19:35:25 CEST] Check if cluster needs any repairs...
INFO[19:35:26 CEST] Joining controlplane node...
INFO[19:35:26 CEST] Waiting 15s to ensure main control plane components are up...  node=23.88.107.100
INFO[19:35:41 CEST] Joining control plane node                    node=23.88.107.100
INFO[19:36:27 CEST] Waiting 15s to ensure main control plane components are up...  node=49.12.192.225
INFO[19:36:42 CEST] Joining control plane node                    node=49.12.192.225
INFO[19:37:16 CEST] Restarting unhealthy API servers if needed...
INFO[19:37:37 CEST] Patching static pods...
INFO[19:37:37 CEST] Patching static pods...
INFO[19:37:37 CEST] Patching static pods...
INFO[19:37:38 CEST] Downloading kubeconfig...
INFO[19:37:38 CEST] Downloading PKI...
INFO[19:37:38 CEST] Creating local backup...                      node=23.88.108.93
INFO[19:37:38 CEST] Ensure node local DNS cache...
INFO[19:37:40 CEST] Activating additional features...
INFO[19:37:43 CEST] Patching coreDNS with uninitialized toleration...
INFO[19:37:46 CEST] Creating credentials secret...
INFO[19:37:47 CEST] Ensure external CCM is up to date...
INFO[19:37:49 CEST] Waiting for nodes to initialize by CCM...
WARN[19:47:49 CEST] Task failed, error was: failed waiting for nodes to be initialized by CCM: timed out waiting for the condition
WARN[19:47:54 CEST] Retrying task...
INFO[19:47:54 CEST] Ensure external CCM is up to date...
INFO[19:47:56 CEST] Waiting for nodes to initialize by CCM...
WARN[19:57:56 CEST] Task failed, error was: failed waiting for nodes to be initialized by CCM: timed out waiting for the condition
WARN[19:58:06 CEST] Retrying task...
INFO[19:58:06 CEST] Ensure external CCM is up to date...
INFO[19:58:08 CEST] Waiting for nodes to initialize by CCM...
WARN[20:08:08 CEST] Task failed, error was: failed waiting for nodes to be initialized by CCM: timed out waiting for the condition
WARN[20:08:28 CEST] Retrying task...
INFO[20:08:28 CEST] Ensure external CCM is up to date...
INFO[20:08:30 CEST] Waiting for nodes to initialize by CCM...
WARN[20:18:30 CEST] Task failed, error was: failed waiting for nodes to be initialized by CCM: timed out waiting for the condition
WARN[20:19:10 CEST] Retrying task...
INFO[20:19:10 CEST] Ensure external CCM is up to date...
INFO[20:19:12 CEST] Waiting for nodes to initialize by CCM...

❯ kubeone version
{
  "kubeone": {
    "major": "1",
    "minor": "3",
    "gitVersion": "1.3.0",
    "gitCommit": "bfe6683334acdbb1a1d9cbbb2d5d5095f6f0111e",
    "gitTreeState": "",
    "buildDate": "2021-09-15T06:03:30Z",
    "goVersion": "go1.16.7",
    "compiler": "gc",
    "platform": "darwin/amd64"
  },
  "machine_controller": {
    "major": "1",
    "minor": "35",
    "gitVersion": "v1.35.2",
    "gitCommit": "",
    "gitTreeState": "",
    "buildDate": "",
    "goVersion": "",
    "compiler": "",
    "platform": "linux/amd64"
  }
}

BTW: After half an hour I opened a second terminal window and ran kubeone status:

❯ kubeone status -m kubeone.yaml -t tf.json -c credentials.yml

INFO[23:48:19 CEST] Determine hostname...
INFO[23:48:20 CEST] Determine operating system...
INFO[23:48:21 CEST] Building Kubernetes clientset...
INFO[23:48:21 CEST] Verifying that nodes in the cluster match nodes defined in the manifest...
INFO[23:48:21 CEST] Verifying that all nodes in the cluster are ready...
INFO[23:48:21 CEST] Verifying that there is no upgrade in progress...
NODE                        VERSION   APISERVER   ETCD
cloud-one-control-plane-1   v1.22.2   healthy     healthy
cloud-one-control-plane-2   v1.22.2   healthy     healthy
cloud-one-control-plane-3   v1.22.2   healthy     healthy

This looks healthy, but the install in the first terminal window is still in progress:

INFO[23:32:56 CEST] Waiting for nodes to initialize by CCM...
WARN[23:42:56 CEST] Task failed, error was: failed waiting for nodes to be initialized by CCM: timed out waiting for the condition
WARN[23:43:01 CEST] Retrying task...
INFO[23:43:01 CEST] Ensure external CCM is up to date...
INFO[23:43:03 CEST] Waiting for nodes to initialize by CCM...

kubectl get all --all-namespaces
NAMESPACE     NAME                                                    READY   STATUS             RESTARTS        AGE
kube-system   pod/calico-kube-controllers-78d6f96c7b-qvdhv            0/1     Pending            0               74m
kube-system   pod/canal-5b6sn                                         2/2     Running            0               74m
kube-system   pod/canal-5nb8b                                         2/2     Running            0               74m
kube-system   pod/canal-c4mcv                                         2/2     Running            0               74m
kube-system   pod/coredns-86886dc5b6-qtkb6                            1/1     Running            0               74m
kube-system   pod/coredns-86886dc5b6-vp256                            1/1     Running            0               74m
kube-system   pod/etcd-cloud-one-control-plane-1                      1/1     Running            0               77m
kube-system   pod/etcd-cloud-one-control-plane-2                      1/1     Running            0               76m
kube-system   pod/etcd-cloud-one-control-plane-3                      1/1     Running            0               75m
kube-system   pod/hcloud-cloud-controller-manager-648985cfc4-hs672    0/1     CrashLoopBackOff   19 (112s ago)   74m
kube-system   pod/kube-apiserver-cloud-one-control-plane-1            1/1     Running            1               77m
kube-system   pod/kube-apiserver-cloud-one-control-plane-2            1/1     Running            0               76m
kube-system   pod/kube-apiserver-cloud-one-control-plane-3            1/1     Running            0               75m
kube-system   pod/kube-controller-manager-cloud-one-control-plane-1   1/1     Running            0               74m
kube-system   pod/kube-controller-manager-cloud-one-control-plane-2   1/1     Running            0               74m
kube-system   pod/kube-controller-manager-cloud-one-control-plane-3   1/1     Running            0               74m
kube-system   pod/kube-proxy-47lg7                                    1/1     Running            0               77m
kube-system   pod/kube-proxy-sh54p                                    1/1     Running            0               76m
kube-system   pod/kube-proxy-wdz74                                    1/1     Running            0               75m
kube-system   pod/kube-scheduler-cloud-one-control-plane-1            1/1     Running            2 (75m ago)     77m
kube-system   pod/kube-scheduler-cloud-one-control-plane-2            1/1     Running            0               76m
kube-system   pod/kube-scheduler-cloud-one-control-plane-3            1/1     Running            0               75m
kube-system   pod/metrics-server-6bd949fccd-j4xmj                     1/1     Running            0               74m
kube-system   pod/node-local-dns-8t5pf                                1/1     Running            0               74m
kube-system   pod/node-local-dns-gjkrl                                1/1     Running            0               74m
kube-system   pod/node-local-dns-w9vg5                                1/1     Running            0               74m

NAMESPACE     NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                  AGE
default       service/kubernetes          ClusterIP   10.96.0.1        <none>        443/TCP                  77m
kube-system   service/kube-dns            ClusterIP   10.96.0.10       <none>        53/UDP,53/TCP,9153/TCP   77m
kube-system   service/kube-dns-upstream   ClusterIP   10.104.254.219   <none>        53/UDP,53/TCP            74m
kube-system   service/metrics-server      ClusterIP   10.97.228.198    <none>        443/TCP                  74m
kube-system   service/node-local-dns      ClusterIP   None             <none>        9253/TCP                 74m

NAMESPACE     NAME                            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
kube-system   daemonset.apps/canal            3         3         3       3            3           kubernetes.io/os=linux   74m
kube-system   daemonset.apps/kube-proxy       3         3         3       3            3           kubernetes.io/os=linux   77m
kube-system   daemonset.apps/node-local-dns   3         3         3       3            3           <none>                   74m

NAMESPACE     NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
kube-system   deployment.apps/calico-kube-controllers           0/1     1            0           74m
kube-system   deployment.apps/coredns                           2/2     2            2           77m
kube-system   deployment.apps/hcloud-cloud-controller-manager   0/1     1            0           74m
kube-system   deployment.apps/metrics-server                    1/1     1            1           74m

NAMESPACE     NAME                                                         DESIRED   CURRENT   READY   AGE
kube-system   replicaset.apps/calico-kube-controllers-78d6f96c7b           1         1         0       74m
kube-system   replicaset.apps/coredns-78fcd69978                           0         0         0       77m
kube-system   replicaset.apps/coredns-86886dc5b6                           2         2         2       74m
kube-system   replicaset.apps/hcloud-cloud-controller-manager-648985cfc4   1         1         0       74m
kube-system   replicaset.apps/metrics-server-6bd949fccd                    1         1         1       74m

kubectl describe po -n kube-system hcloud-cloud-controller-manager-648985cfc4-hs672
Name:         hcloud-cloud-controller-manager-648985cfc4-hs672
Namespace:    kube-system
Priority:     0
Node:         cloud-one-control-plane-1/192.168.0.4
Start Time:   Sat, 23 Oct 2021 23:33:16 +0200
Labels:       app=hcloud-cloud-controller-manager
              pod-template-hash=648985cfc4
Annotations:  scheduler.alpha.kubernetes.io/critical-pod:
Status:       Running
IP:           192.168.0.4
IPs:
  IP:           192.168.0.4
Controlled By:  ReplicaSet/hcloud-cloud-controller-manager-648985cfc4
Containers:
  hcloud-cloud-controller-manager:
    Container ID:  containerd://07cf86cd6703a49e8cb6a5cdfc044a67bfd6d7a16fd74b63ea3093db1e5b5113
    Image:         docker.io/hetznercloud/hcloud-cloud-controller-manager:v1.9.1
    Image ID:      docker.io/hetznercloud/hcloud-cloud-controller-manager@sha256:a48e153d1e692a6f66a00217db53f89ea9740238e1367c23975ffa526e9c74e7
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/hcloud-cloud-controller-manager
      --cloud-provider=hcloud
      --leader-elect=false
      --allow-untagged-cloud
      --allocate-node-cidrs=true
      --cluster-cidr=10.244.0.0/16
    State:          Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Sun, 24 Oct 2021 00:50:56 +0200
      Finished:     Sun, 24 Oct 2021 00:50:56 +0200
    Last State:     Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Sun, 24 Oct 2021 00:45:45 +0200
      Finished:     Sun, 24 Oct 2021 00:45:47 +0200
    Ready:          False
    Restart Count:  20
    Requests:
      cpu:     100m
      memory:  50Mi
    Environment:
      NODE_NAME:                              (v1:spec.nodeName)
      HCLOUD_TOKEN:                          <set to the key 'HZ_TOKEN' in secret 'cloud-provider-credentials'>  Optional: false
      HCLOUD_LOAD_BALANCERS_USE_PRIVATE_IP:  true
      HCLOUD_NETWORK:                        1247449
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-bkg9s (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-bkg9s:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 CriticalAddonsOnly op=Exists
                             node-role.kubernetes.io/control-plane:NoSchedule
                             node-role.kubernetes.io/master:NoSchedule
                             node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule
                             node.kubernetes.io/not-ready:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                    From     Message
  ----     ------   ----                   ----     -------
  Warning  BackOff  2m46s (x342 over 77m)  kubelet  Back-off restarting failed container
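
The describe output above only reports exit code 255 and the back-off, not why the container exits. A minimal sketch of two follow-up checks, assuming kubectl points at this cluster: read the logs of the previous (crashed) container, and sanity-check the token stored in the cloud-provider-credentials secret under the HZ_TOKEN key (both names are taken from the describe output). The helper names ccm_previous_logs and check_ccm_token are hypothetical, and the check assumes that a Hetzner API token consists only of letters and digits.

```shell
# Hypothetical helpers; they only wrap standard kubectl calls.

# Print the logs of the previous (crashed) container instance of a pod.
# $1: pod name, e.g. hcloud-cloud-controller-manager-648985cfc4-hs672
ccm_previous_logs() {
  kubectl --namespace kube-system logs "$1" --previous
}

# Decode the token the CCM receives and check its shape without printing it.
# Assumption: a valid token contains only ASCII letters and digits.
check_ccm_token() {
  token=$(kubectl --namespace kube-system get secret cloud-provider-credentials \
    --output 'jsonpath={.data.HZ_TOKEN}' | base64 --decode)
  case "$token" in
    "") echo "token is empty"; return 1 ;;
    *[!A-Za-z0-9]*) echo "token contains unexpected characters"; return 1 ;;
    *) echo "token looks well-formed (${#token} characters)" ;;
  esac
}
```

For the pod in this issue, that would be `ccm_previous_logs hcloud-cloud-controller-manager-648985cfc4-hs672` followed by `check_ccm_token`.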

*** These are (filtered) errors and warnings when using kubeone install --verbose --debug:

[49.12.209.6] Warning: apt-key output should not be parsed (stdout is not a terminal)
...
[49.12.209.6] + sudo apt-key add -
[49.12.209.6] Warning: apt-key output should not be parsed (stdout is not a terminal)
[49.12.209.6] OK
...
[49.12.213.77] Warning: apt-key output should not be parsed (stdout is not a terminal)
...
[157.90.251.185] Warning: apt-key output should not be parsed (stdout is not a terminal)
...
[49.12.213.77] Warning: apt-key output should not be parsed (stdout is not a terminal)
...
[157.90.251.185] Warning: apt-key output should not be parsed (stdout is not a terminal)
...
[49.12.209.6] update-rc.d: warning: start and stop actions are no longer supported; falling back to defaults
...
[49.12.213.77] update-rc.d: warning: start and stop actions are no longer supported; falling back to defaults
...
[157.90.251.185] update-rc.d: warning: start and stop actions are no longer supported; falling back to defaults
...
[157.90.251.185] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta2, Kind=JoinConfiguration
...
[49.12.213.77] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta2, Kind=JoinConfiguration
...
[49.12.209.6] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta2, Kind=JoinConfiguration
...
[49.12.213.77] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta2, Kind=JoinConfiguration
...
[49.12.209.6] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta2, Kind=JoinConfiguration
...
[157.90.251.185] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta2, Kind=JoinConfiguration
...
INFO[10:58:58 CEST] Ensure node local DNS cache...
...
INFO[10:58:59 CEST] Parsing addons manifest 'nodelocaldns.yaml'
+ sudo KUBECONFIG=/etc/kubernetes/admin.conf \
kubectl apply -f - --prune -l "kubeone.io/addon=nodelocaldns"
Warning: extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
configmap/node-local-dns created
serviceaccount/node-local-dns created
service/kube-dns-upstream created
daemonset.apps/node-local-dns created
service/node-local-dns created
...
INFO[10:59:01 CEST] Activating additional features...
...
INFO[10:59:01 CEST] Parsing addons manifest 'metrics-server.yaml'
+ sudo KUBECONFIG=/etc/kubernetes/admin.conf \
kubectl apply -f - --prune -l "kubeone.io/addon=metrics-server"

Warning: extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
serviceaccount/metrics-server created
.....
...
INFO[10:59:03 CEST] Patching coreDNS with uninitialized toleration...
...
INFO[10:59:03 CEST] Parsing addons manifest 'canal.yaml'
+ sudo KUBECONFIG=/etc/kubernetes/admin.conf \
kubectl apply -f - --prune -l "kubeone.io/addon=cni-canal"
Warning: extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
configmap/canal-config created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
...
INFO[10:59:08 CEST] Parsing addons manifest 'ccm-hetzner.yaml'
+ sudo KUBECONFIG=/etc/kubernetes/admin.conf \
kubectl apply -f - --prune -l "kubeone.io/addon=ccm-hetzner"

Warning: extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
serviceaccount/cloud-controller-manager created
clusterrolebinding.rbac.authorization.k8s.io/system:cloud-controller-manager created
deployment.apps/hcloud-cloud-controller-manager created
INFO[10:59:10 CEST] Waiting for nodes to initialize by CCM...
WARN[11:09:10 CEST] Task failed, error was: failed waiting for nodes to be initialized by CCM: timed out waiting for the condition
WARN[11:09:15 CEST] Retrying task...
INFO[11:09:15 CEST] Ensure external CCM is up to date...
INFO[11:09:16 CEST] Parsing addons manifest 'ccm-hetzner.yaml'
+ sudo KUBECONFIG=/etc/kubernetes/admin.conf \
kubectl apply -f - --prune -l "kubeone.io/addon=ccm-hetzner"

Warning: extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
serviceaccount/cloud-controller-manager unchanged
clusterrolebinding.rbac.authorization.k8s.io/system:cloud-controller-manager unchanged
deployment.apps/hcloud-cloud-controller-manager unchanged
INFO[11:09:18 CEST] Waiting for nodes to initialize by CCM...

exocode commented 2 years ago

I found the issue:

In my credentials.yml I was missing the opening quotation mark:


HCLOUD_TOKEN: mysecrets"
              ^-------------------- the opening quotation mark was missing here
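
A value with only a trailing quotation mark is, as far as I can tell, still a valid YAML plain scalar, so the file parses and the stray character presumably ends up in the token itself, which would explain why the CCM crashed instead of the file being rejected. A minimal sketch of a pre-flight check that flags lines with an odd number of quote characters; the helper name check_credentials_quotes is hypothetical, and this is only a heuristic, since a YAML value can legitimately contain a single quote character:

```shell
# Hypothetical helper: flag lines whose quote characters are unbalanced.
# $1: path to the credentials file
check_credentials_quotes() {
  awk -v sq="'" '
    {
      line = $0
      dq_count = gsub(/"/, "", line)   # gsub returns the number of replacements
      sq_count = gsub(sq, "", line)
      if (dq_count % 2 || sq_count % 2) {
        # Report only the location so the secret itself is not echoed.
        printf "%s:%d: unbalanced quote\n", FILENAME, NR
        bad = 1
      }
    }
    END { exit bad }
  ' "$1"
}
```

Running `check_credentials_quotes credentials.yml` before kubeone install returns a non-zero exit status and prints the file name and line number when a line looks suspicious.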

First barrier mastered. Next step: installing Kubermatic. 🚀