cbeneke / hcloud-fip-controller

Kubernetes controller to (re-)assign floating IPs on hetzner cloud instances
Apache License 2.0

Could not find address for node #25

Closed: hvanmaanen closed this issue 4 years ago

hvanmaanen commented 4 years ago

The fip controller returns the following error on a brand new cluster.

time="2020-01-15T14:47:13Z" level=fatal msg="Could not run controller: could not get kubernetes node address: could not find address for node node-1" func="github.com/cbeneke/hcloud-fip-controller/internal/app/fipcontroller.(*Controller).onStartedLeading" file="/app/internal/app/fipcontroller/leaderelection.go:56"

The following pods are running:

NAMESPACE        NAME                                       READY   STATUS             RESTARTS   AGE
fip-controller   hcloud-fip-controller-596cbc46b8-n2g75     0/1     CrashLoopBackOff   7          20m
fip-controller   hcloud-fip-controller-596cbc46b8-wtrqp     0/1     CrashLoopBackOff   7          20m
fip-controller   hcloud-fip-controller-596cbc46b8-x6prl     0/1     CrashLoopBackOff   7          20m
kube-system      calico-kube-controllers-558ffb65c4-9p8nk   1/1     Running            0          3h38m
kube-system      calico-node-545zj                          1/1     Running            1          3h38m
kube-system      calico-node-mgj4v                          1/1     Running            1          3h38m
kube-system      coredns-58687784f9-5hk5m                   1/1     Running            0          3h37m
kube-system      coredns-58687784f9-f9gd8                   1/1     Running            0          3h37m
kube-system      dns-autoscaler-79599df498-gm5sh            1/1     Running            0          3h37m
kube-system      haproxy-node-1                             1/1     Running            0          3h37m
kube-system      kube-apiserver-master-1                    1/1     Running            0          3h39m
kube-system      kube-controller-manager-master-1           1/1     Running            0          3h39m
kube-system      kube-proxy-24s4w                           1/1     Running            0          3h38m
kube-system      kube-proxy-nrq5g                           1/1     Running            0          3h39m
kube-system      kube-scheduler-master-1                    1/1     Running            0          3h39m
kube-system      kubernetes-dashboard-556b9ff8f8-n7rhs      1/1     Running            0          3h37m
kube-system      nodelocaldns-fwgz9                         1/1     Running            0          3h37m
kube-system      nodelocaldns-h46tf                         1/1     Running            0          3h37m
metallb-system   metallb-controller-7fbbd7dcbb-d7fx6        1/1     Running            0          20m
metallb-system   metallb-speaker-82xwv                      1/1     Running            0          20m
metallb-system   metallb-speaker-lgnmk                      1/1     Running            0          20m
cbeneke commented 4 years ago

Hi, do your nodes have an external IP in their status field? (Check e.g. with kubectl get nodes -o wide whether the EXTERNAL-IP field is set.) And did you apply the RBAC configuration for the fip-controller serviceAccount?

Can you run the controller in debug mode and share the logs (feel free to redact sensitive values)? To do that, update your fip-controller config to something like:

{
  "hcloud_floating_ips": [ "<hcloud-floating-ip>" ],
  "log_level": "Debug"
}
hvanmaanen commented 4 years ago

Wait, is the hetzner-cloud-controller a dependency?

cbeneke commented 4 years ago

The fip-controller is built on the assumption that your nodes are deployed in a Hetzner cloud cluster. Whether you manage the nodes and other kubernetes objects via the hetzner-cloud-controller or any other controller that does the same job does not matter, but I'd argue the hetzner-cloud-controller is a de-facto dependency. (You can run your cluster without a cloud controller, but that has other implications.)

If your (external) node IPs are registered on the nodes as internal IPs (which AFAIR is the case if you have no cloud controller deployed in the cluster, since the hostname resolves to the external IP), you can also set the node_address_type option to internal. Compare the README:

NODE_ADDRESS_TYPE, default: "external"
Address type of the nodes. This might be set to internal, if your external IPs are registered as internal IPs on the node objects (e.g. if you have no cloud controller manager). Can be "external" or "internal".
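
For example, building on the debug config above, that option could look roughly like this (the node_address_type JSON key is an assumption here, mirroring the NODE_ADDRESS_TYPE environment variable and the snake_case style of the other keys):

{
  "hcloud_floating_ips": [ "<hcloud-floating-ip>" ],
  "node_address_type": "internal",
  "log_level": "Debug"
}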
hvanmaanen commented 4 years ago

Just checked, all external IPs are empty. We currently can't run the hcloud controller because we use Kubespray, which doesn't support the external provider flag.

We can't use internal IPs either, because we run the cluster on a private network, i.e. the internal IPs really are internal IPs (10.0.0.0/8).

cbeneke commented 4 years ago

The node address is used to match the kubernetes node objects with the Hetzner virtual machines. Depending on what the server.PublicNet object holds for your nodes (cf. https://github.com/cbeneke/hcloud-fip-controller/blob/95c045c7627eb90471836b5cc1dc6cf65e82e066/internal/app/fipcontroller/hcloud.go#L42), it should still work fine with the internal IPs. That part was written before Hetzner added the private network feature, so it might be that I have to change the logic there.

Can you give me some insight into how your cluster is set up? I guess your nodes have both an external and an internal IP, but only the internal IP is registered in your kubernetes cluster, right? I'll have to set up a small test cluster that way to check how to fix this (PR welcome if you feel comfortable writing Go :) ). I hope to find some time for this in the near future.

hvanmaanen commented 4 years ago

Here is the output of kubectl get nodes -o wide

NAME       STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                       KERNEL-VERSION   CONTAINER-RUNTIME
master-1   Ready    master   16m   v1.16.3   10.0.0.10     <none>        Debian GNU/Linux 10 (buster)   4.19.0-6-amd64   docker://18.9.7
master-2   Ready    master   16m   v1.16.3   10.0.0.11     <none>        Debian GNU/Linux 10 (buster)   4.19.0-6-amd64   docker://18.9.7
master-3   Ready    master   16m   v1.16.3   10.0.0.12     <none>        Debian GNU/Linux 10 (buster)   4.19.0-6-amd64   docker://18.9.7
node-1     Ready    <none>   15m   v1.16.3   10.0.0.30     <none>        Debian GNU/Linux 10 (buster)   4.19.0-6-amd64   docker://18.9.7
node-2     Ready    <none>   14m   v1.16.3   10.0.0.31     <none>        Debian GNU/Linux 10 (buster)   4.19.0-6-amd64   docker://18.9.7

If you need more information, just ask. Where can I find documentation on what information is available from kubernetes node objects?

hvanmaanen commented 4 years ago

If I understand correctly, there would have to be a new config variable, e.g. private_network: true/false. If private_network is true, the controller would make a call to the Hetzner API that gets all internal IPs and compares them to the internal IP of the kubernetes node.

Is this flow correct?

I will write a PR in the coming weeks to fix this problem.
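
For illustration, the flow described above might look roughly like this with the hcloud-go client (findServerByInternalIP and its signature are only illustrative, not code from the controller):

package fipcontroller

import (
    "context"
    "fmt"
    "net"

    "github.com/hetznercloud/hcloud-go/hcloud"
)

// findServerByInternalIP lists all servers via the Hetzner API and returns
// the one whose private-network IP matches the node's internal IP.
func findServerByInternalIP(ctx context.Context, client *hcloud.Client, nodeIP net.IP) (*hcloud.Server, error) {
    servers, err := client.Server.All(ctx)
    if err != nil {
        return nil, fmt.Errorf("could not list servers: %v", err)
    }
    for _, server := range servers {
        for _, privateNet := range server.PrivateNet {
            if privateNet.IP.Equal(nodeIP) {
                return server, nil
            }
        }
    }
    return nil, fmt.Errorf("no server found for internal IP %s", nodeIP)
}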

cbeneke commented 4 years ago

Hey,

thanks for the input! Yes, either this, or add a flag to match kubernetes nodes and Hetzner servers via their names (but I prefer the first way, because I think the second one might introduce a lot more errors).

I'd argue for using e.g. hcloud_address_type instead of private_network (comparable to the node_address_type option for kubernetes).

That would be nice! Thanks! :)

cbeneke commented 4 years ago

Thinking about it... it might be sufficient to just check whether the IP belongs to any of the public or private networks of that server. This should be a quick fix then.
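
In code, that check could look roughly like this (a sketch against the hcloud-go types; serverHasIP is an illustrative name, not necessarily how the controller implements it):

package fipcontroller

import (
    "net"

    "github.com/hetznercloud/hcloud-go/hcloud"
)

// serverHasIP reports whether the given node address matches the server's
// public IPv4 or any IP in its attached private networks.
func serverHasIP(server *hcloud.Server, nodeAddress net.IP) bool {
    if server.PublicNet.IPv4.IP.Equal(nodeAddress) {
        return true
    }
    for _, privateNet := range server.PrivateNet {
        if privateNet.IP.Equal(nodeAddress) {
            return true
        }
    }
    return false
}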

cbeneke commented 4 years ago

Could you try cbeneke/hcloud-fip-controller:4004afb?

hvanmaanen commented 4 years ago

That works like a charm, thanks for the help!

cbeneke commented 4 years ago

Thanks for the feedback, I'll make a 0.3.1 release from this then :)