mxschmitt / ui-driver-hetzner

Rancher UI driver for the Hetzner Cloud docker driver.
https://mxschmitt.github.io/ui-driver-hetzner
Apache License 2.0

Can't establish dialer connection: can not build dialer to [c-wd69j:m-tqr6b] #69

Closed by wujood 4 years ago

wujood commented 4 years ago

While installing Rancher on Hetzner, I tried to use this driver to provision the servers for a new cluster. My Rancher installation is brand new and runs the latest available version.

I created a node template for the cx21 type and a simple cluster with one etcd/controlplane/worker node, but I got the same error with bigger clusters. After I create the cluster, Rancher starts provisioning the servers normally, and in Hetzner Cloud I can see the new nodes being created, but then it gets stuck and shows

This cluster is currently Provisioning; areas that interact directly with it will not be available until the API is ready.

Cluster must have at least one etcd plane host: failed to connect to the following etcd host(s) []

in the Rancher UI. I looked into the logs of the Rancher container (single-node installation) and found these lines:

2019/11/13 10:07:59 [INFO] Provisioning cluster [c-wd69j]
2019/11/13 10:07:59 [INFO] Creating cluster [c-wd69j]
2019/11/13 10:08:04 [INFO] kontainerdriver rancherkubernetesengine listening on address 127.0.0.1:45485
2019/11/13 10:08:04 [ERROR] Cluster c-wd69j previously failed to create
2019/11/13 10:08:04 [INFO] cluster [c-wd69j] provisioning: Initiating Kubernetes cluster
2019/11/13 10:08:04 [INFO] cluster [c-wd69j] provisioning: [certificates] Generating admin certificates and kubeconfig
2019/11/13 10:08:04 [INFO] cluster [c-wd69j] provisioning: Successfully Deployed state file at [management-state/rke/rke-548000793/cluster.rkestate]
2019/11/13 10:08:04 [INFO] kontainerdriver rancherkubernetesengine stopped
2019/11/13 10:08:04 [INFO] cluster [c-wd69j] provisioning: Building Kubernetes cluster
2019/11/13 10:08:04 [INFO] cluster [c-wd69j] provisioning: [dialer] Setup tunnel for host [78.46.164.138]
2019/11/13 10:08:04 [ERROR] cluster [c-wd69j] provisioning: Failed to set up SSH tunneling for host [78.46.164.138]: Can't establish dialer connection: can not build dialer to [c-wd69j:m-tqr6b]
2019/11/13 10:08:04 [INFO] cluster [c-wd69j] provisioning: [dialer] Setup tunnel for host [78.46.214.230]
2019/11/13 10:08:04 [ERROR] cluster [c-wd69j] provisioning: Failed to set up SSH tunneling for host [78.46.214.230]: Can't establish dialer connection: can not build dialer to [c-wd69j:m-6qscr]
2019/11/13 10:08:04 [ERROR] cluster [c-wd69j] provisioning: Removing host [78.46.164.138] from node lists
2019/11/13 10:08:04 [ERROR] cluster [c-wd69j] provisioning: Removing host [78.46.214.230] from node lists
2019/11/13 10:08:04 [ERROR] cluster [c-wd69j] provisioning: Cluster must have at least one etcd plane host: failed to connect to the following etcd host(s) []
2019/11/13 10:08:04 [ERROR] ClusterController c-wd69j [cluster-provisioner-controller] failed with : Cluster must have at least one etcd plane host: failed to connect to the following etcd host(s) []
2019-11-13 10:08:13.179671 I | mvcc: store.index: compact 73394
2019-11-13 10:08:13.184619 I | mvcc: finished scheduled compaction at 73394 (took 3.08339ms)
2019-11-13 10:13:13.190368 I | mvcc: store.index: compact 74037
2019-11-13 10:13:13.199361 I | mvcc: finished scheduled compaction at 74037 (took 4.986543ms)
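To see at a glance which hosts were dropped, the dialer failures can be pulled out of a log like the one above. This is only a rough sketch: the function name `dialer_failures` is made up for illustration, the pattern is fitted to the exact wording of the two error lines shown here, and reading `[c-wd69j:m-tqr6b]` as a cluster ID followed by a node ID is an assumption.

```python
import re

# Pattern fitted to the error lines in the log above.
# Assumption: the bracketed pair at the end is "<cluster id>:<node id>".
DIALER_FAILURE = re.compile(
    r"Failed to set up SSH tunneling for host \[(?P<host>[^\]]+)\]: "
    r".*can not build dialer to \[(?P<cluster>[^:\]]+):(?P<node>[^\]]+)\]"
)


def dialer_failures(log_text):
    """Return (host, cluster, node) for every dialer failure line."""
    failures = []
    for line in log_text.splitlines():
        match = DIALER_FAILURE.search(line)
        if match:
            failures.append((match["host"], match["cluster"], match["node"]))
    return failures


# Two lines copied from the log above.
SAMPLE_LOG = """\
2019/11/13 10:08:04 [ERROR] cluster [c-wd69j] provisioning: Failed to set up SSH tunneling for host [78.46.164.138]: Can't establish dialer connection: can not build dialer to [c-wd69j:m-tqr6b]
2019/11/13 10:08:04 [ERROR] cluster [c-wd69j] provisioning: Failed to set up SSH tunneling for host [78.46.214.230]: Can't establish dialer connection: can not build dialer to [c-wd69j:m-6qscr]
"""

if __name__ == "__main__":
    for host, cluster, node in dialer_failures(SAMPLE_LOG):
        print(host, cluster, node)
```

Every host that shows up here is also removed from the node lists a moment later, which is why the final error lists no etcd hosts at all.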

(I am new to writing issues and also new to Hetzner Cloud. If I have not explained the issue well enough, please tell me and I will do my best to provide more information.)

wujood commented 4 years ago

Figured it out! The problem was that Rancher was running with a wrong certificate. If someone runs into this error one day: look into your *.pem files and make sure that:

ianmcfall commented 3 years ago

Quoting wujood: "Figured it out! The problem was that Rancher was running with a wrong certificate. If someone runs into this error one day: look into your *.pem files and make sure that:"

I am facing this same issue:

Question: Where are the *.pem files that you're referencing?

joshperry commented 2 years ago

Ran into the same problem; it was indeed caused by a newline at the end of my private CA cert.
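For anyone wanting to check their own files, here is a minimal sketch that shows what follows the last PEM footer and returns a copy with those bytes cut off. The helper names `trailing_after_pem` and `strip_trailing` are made up for illustration and the certificate body is a placeholder, not a real cert. The thread does not say exactly how much trailing whitespace is too much or how the cert was supplied, so the sketch only reports and trims; it does not decide what counts as valid.

```python
import re

# Matches a PEM footer line such as "-----END CERTIFICATE-----".
PEM_END = re.compile(rb"-----END [A-Z0-9 ]+-----")


def trailing_after_pem(data: bytes) -> bytes:
    """Return the bytes that follow the last PEM END marker."""
    matches = list(PEM_END.finditer(data))
    if not matches:
        raise ValueError("no PEM END marker found")
    return data[matches[-1].end():]


def strip_trailing(data: bytes) -> bytes:
    """Return the data cut off right after the last PEM END marker."""
    trailing = trailing_after_pem(data)
    return data[: len(data) - len(trailing)]


# Placeholder content for illustration only; "MIIB" is not a real certificate.
EXAMPLE = b"-----BEGIN CERTIFICATE-----\nMIIB\n-----END CERTIFICATE-----\n\n"

if __name__ == "__main__":
    # repr() makes otherwise invisible newlines and spaces visible.
    print(repr(trailing_after_pem(EXAMPLE)))
    print(repr(strip_trailing(EXAMPLE)))
```

To check a real file, read it in binary mode (for example with `pathlib.Path(...).read_bytes()`) and pass the result to `trailing_after_pem`.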