hetznercloud / hcloud-cloud-controller-manager

Kubernetes cloud-controller-manager for Hetzner Cloud
Apache License 2.0

1.6.0 does not work with networks and weave-net #52

Closed ckotzbauer closed 4 years ago

ckotzbauer commented 4 years ago

Hi all, I tried for multiple hours to update the cloud-provider from 1.5.2 to 1.6.0 (always with fresh clusters), but it didn't work. All nodes are always marked as NodeNetworkUnavailable: true. If I switch back to 1.5.2 without changing anything else, it works.

My setup:

Is there anything I missed there?

LKaemmerling commented 4 years ago

Could you provide the logs from the cloud controller pod in kube-system?

ckotzbauer commented 4 years ago

Yes, of course. Sorry, I forgot.

Flag --allow-untagged-cloud has been deprecated, This flag is deprecated and will be removed in a future release. A cluster-id will be required on cloud instances.
I0624 07:26:32.300949       1 serving.go:313] Generated self-signed cert in-memory
W0624 07:26:32.681722       1 client_config.go:552] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0624 07:26:32.687888       1 controllermanager.go:120] Version: v0.0.0-master+$Format:%h$
Hetzner Cloud k8s cloud controller v1.6.0 started
W0624 07:26:33.198797       1 controllermanager.go:132] detected a cluster without a ClusterID.  A ClusterID will be required in the future.  Please tag your cluster to avoid any future issues
I0624 07:26:33.202566       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0624 07:26:33.202865       1 shared_informer.go:223] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0624 07:26:33.202611       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0624 07:26:33.202899       1 shared_informer.go:223] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0624 07:26:33.202774       1 secure_serving.go:178] Serving securely on [::]:10258
I0624 07:26:33.208965       1 controllermanager.go:247] Started "service"
I0624 07:26:33.209087       1 controller.go:208] Starting service controller
I0624 07:26:33.209094       1 shared_informer.go:223] Waiting for caches to sync for service
I0624 07:26:33.202785       1 tlsconfig.go:240] Starting DynamicServingCertificateController
I0624 07:26:33.299459       1 controllermanager.go:247] Started "route"
I0624 07:26:33.302380       1 route_controller.go:100] Starting route controller
I0624 07:26:33.302404       1 shared_informer.go:223] Waiting for caches to sync for route
I0624 07:26:33.303812       1 shared_informer.go:230] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
I0624 07:26:33.303888       1 node_controller.go:110] Sending events to api server.
I0624 07:26:33.303958       1 controllermanager.go:247] Started "cloud-node"
I0624 07:26:33.305344       1 node_lifecycle_controller.go:78] Sending events to api server
I0624 07:26:33.305385       1 controllermanager.go:247] Started "cloud-node-lifecycle"
I0624 07:26:33.305827       1 shared_informer.go:230] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 
I0624 07:26:33.371297       1 node_controller.go:325] Initializing node kubeworker3 with cloud provider
I0624 07:26:33.402526       1 shared_informer.go:230] Caches are synced for route 
I0624 07:26:33.409215       1 shared_informer.go:230] Caches are synced for service 
I0624 07:26:33.733006       1 route_controller.go:269] node kubeworker1 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:26:33.733034       1 route_controller.go:303] Patching node status kubeworker1 with false previous condition was:nil
I0624 07:26:33.738508       1 route_controller.go:269] node kubeworker3 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:26:33.738528       1 route_controller.go:303] Patching node status kubeworker3 with false previous condition was:nil
I0624 07:26:33.738790       1 route_controller.go:269] node kubeworker2 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:26:33.738796       1 route_controller.go:303] Patching node status kubeworker2 with false previous condition was:nil
I0624 07:26:33.738942       1 route_controller.go:269] node kubemaster1 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:26:33.738947       1 route_controller.go:303] Patching node status kubemaster1 with false previous condition was:nil
I0624 07:26:34.813292       1 node_controller.go:397] Successfully initialized node kubeworker3 with cloud provider
I0624 07:26:34.813312       1 node_controller.go:325] Initializing node kubeworker2 with cloud provider
I0624 07:26:36.414639       1 node_controller.go:397] Successfully initialized node kubeworker2 with cloud provider
I0624 07:26:36.414670       1 node_controller.go:325] Initializing node kubemaster1 with cloud provider
I0624 07:26:37.908757       1 node_controller.go:397] Successfully initialized node kubemaster1 with cloud provider
I0624 07:26:37.909000       1 node_controller.go:325] Initializing node kubeworker1 with cloud provider
E0624 07:26:39.199489       1 node_lifecycle_controller.go:155] error checking if node kubeworker1 is shutdown: hcloud/instances.InstanceShutdownByProviderID: hcloud/providerIDToServerID: missing prefix hcloud://: 
E0624 07:26:39.375971       1 node_lifecycle_controller.go:172] error checking if node kubeworker1 exists: hcloud/instances.InstanceExistsByProviderID: hcloud/providerIDToServerID: missing prefix hcloud://: 6407101
I0624 07:26:39.484968       1 node_controller.go:397] Successfully initialized node kubeworker1 with cloud provider
I0624 07:26:39.485982       1 node_controller.go:325] Initializing node kubeworker1 with cloud provider
I0624 07:26:39.503153       1 node_controller.go:325] Initializing node kubemaster1 with cloud provider
I0624 07:26:39.509307       1 node_controller.go:325] Initializing node kubeworker2 with cloud provider
I0624 07:26:39.519563       1 node_controller.go:325] Initializing node kubeworker3 with cloud provider
I0624 07:26:39.531442       1 node_controller.go:325] Initializing node kubeworker1 with cloud provider
I0624 07:26:39.540495       1 node_controller.go:325] Initializing node kubemaster1 with cloud provider
I0624 07:26:39.546957       1 node_controller.go:325] Initializing node kubeworker2 with cloud provider
I0624 07:26:39.552350       1 node_controller.go:325] Initializing node kubeworker3 with cloud provider
I0624 07:26:39.555982       1 node_controller.go:325] Initializing node kubeworker1 with cloud provider
I0624 07:26:43.749754       1 route_controller.go:269] node kubeworker3 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:26:43.750589       1 route_controller.go:269] node kubeworker2 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:26:43.750922       1 route_controller.go:269] node kubemaster1 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:26:43.751417       1 route_controller.go:269] node kubeworker1 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:26:47.781524       1 node_controller.go:325] Initializing node kubemaster3 with cloud provider
I0624 07:26:49.534750       1 node_controller.go:397] Successfully initialized node kubemaster3 with cloud provider
I0624 07:26:49.535117       1 node_controller.go:325] Initializing node kubemaster3 with cloud provider
I0624 07:26:49.567382       1 node_controller.go:325] Initializing node kubemaster3 with cloud provider
I0624 07:26:49.603467       1 node_controller.go:325] Initializing node kubemaster3 with cloud provider
I0624 07:26:49.628214       1 node_controller.go:325] Initializing node kubemaster2 with cloud provider
I0624 07:26:51.215036       1 node_controller.go:397] Successfully initialized node kubemaster2 with cloud provider
I0624 07:26:51.215371       1 node_controller.go:325] Initializing node kubemaster2 with cloud provider
I0624 07:26:51.264505       1 node_controller.go:325] Initializing node kubemaster2 with cloud provider
I0624 07:26:51.272617       1 node_controller.go:325] Initializing node kubemaster2 with cloud provider
I0624 07:26:53.827477       1 route_controller.go:269] node kubemaster3 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:26:53.827844       1 route_controller.go:303] Patching node status kubemaster3 with false previous condition was:nil
I0624 07:26:53.828640       1 route_controller.go:269] node kubemaster2 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:26:53.833111       1 route_controller.go:303] Patching node status kubemaster2 with false previous condition was:nil
I0624 07:26:53.833525       1 route_controller.go:269] node kubeworker3 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:26:53.833643       1 route_controller.go:303] Patching node status kubeworker3 with false previous condition was:&NodeCondition{Type:NetworkUnavailable,Status:False,LastHeartbeatTime:2020-06-24 07:26:47 +0000 UTC,LastTransitionTime:2020-06-24 07:26:47 +0000 UTC,Reason:WeaveIsUp,Message:Weave pod has set this,}
I0624 07:26:53.834055       1 route_controller.go:269] node kubeworker2 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:26:53.834144       1 route_controller.go:303] Patching node status kubeworker2 with false previous condition was:&NodeCondition{Type:NetworkUnavailable,Status:False,LastHeartbeatTime:2020-06-24 07:26:47 +0000 UTC,LastTransitionTime:2020-06-24 07:26:47 +0000 UTC,Reason:WeaveIsUp,Message:Weave pod has set this,}
I0624 07:26:53.834427       1 route_controller.go:269] node kubemaster1 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:26:53.834549       1 route_controller.go:303] Patching node status kubemaster1 with false previous condition was:&NodeCondition{Type:NetworkUnavailable,Status:False,LastHeartbeatTime:2020-06-24 07:26:46 +0000 UTC,LastTransitionTime:2020-06-24 07:26:46 +0000 UTC,Reason:WeaveIsUp,Message:Weave pod has set this,}
I0624 07:26:53.834829       1 route_controller.go:269] node kubeworker1 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:27:03.817709       1 route_controller.go:269] node kubemaster2 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:27:03.817774       1 route_controller.go:269] node kubeworker3 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:27:03.817783       1 route_controller.go:269] node kubeworker2 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:27:03.817790       1 route_controller.go:269] node kubemaster1 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:27:03.817797       1 route_controller.go:269] node kubeworker1 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:27:03.817803       1 route_controller.go:269] node kubemaster3 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:27:13.802460       1 route_controller.go:269] node kubeworker1 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:27:13.803209       1 route_controller.go:303] Patching node status kubeworker1 with false previous condition was:&NodeCondition{Type:NetworkUnavailable,Status:False,LastHeartbeatTime:2020-06-24 07:27:12 +0000 UTC,LastTransitionTime:2020-06-24 07:27:12 +0000 UTC,Reason:WeaveIsUp,Message:Weave pod has set this,}
I0624 07:27:13.807169       1 route_controller.go:269] node kubemaster3 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:27:13.807489       1 route_controller.go:269] node kubemaster2 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:27:13.807660       1 route_controller.go:269] node kubeworker3 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:27:13.807859       1 route_controller.go:269] node kubeworker2 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:27:13.808001       1 route_controller.go:269] node kubemaster1 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:27:23.847019       1 route_controller.go:269] node kubeworker1 has no routes assigned to it. NodeNetworkUnavailable will be set to true
I0624 07:27:23.847651       1 route_controller.go:269] node kubemaster3 has no routes assigned to it. NodeNetworkUnavailable will be set to true
...

LKaemmerling commented 4 years ago

error checking if node kubeworker1 exists: hcloud/instances.InstanceExistsByProviderID: hcloud/providerIDToServerID: missing prefix hcloud://: 6407101

This indicates that you did not initialize the cluster with:

[Service]
Environment="KUBELET_EXTRA_ARGS=--cloud-provider=external"

(Step 1 from the Deployment in the Readme Section: https://github.com/hetznercloud/hcloud-cloud-controller-manager)
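
The error message quoted above suggests that the controller expects each node's provider ID to start with `hcloud://`, followed by the numeric server ID. A minimal Python sketch of that kind of check, for illustration only: the function name and error wording here are assumptions modelled on the log line, not the controller's actual Go code.

```python
PREFIX = "hcloud://"


def provider_id_to_server_id(provider_id: str) -> int:
    """Extract the numeric server ID from an ID such as 'hcloud://6407101'.

    Hypothetical helper: it mirrors the shape of the logged error, not the
    real implementation in hcloud-cloud-controller-manager.
    """
    if not provider_id.startswith(PREFIX):
        # A bare ID such as '6407101' or an empty string ends up here.
        raise ValueError(f"missing prefix {PREFIX}: {provider_id}")
    raw = provider_id[len(PREFIX):]
    if not raw.isdigit():
        raise ValueError(f"invalid server ID: {provider_id}")
    return int(raw)
```

With this sketch, `provider_id_to_server_id("hcloud://6407101")` returns the integer ID, while the bare value `"6407101"` from the log raises an error.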

You are also trying to create routes with an invalid IP range. All networks need to be within a private range from RFC 1918, and the Pod Subnet (service-cidr) should also be within this network.
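
Both conditions can be checked offline before creating a cluster. The following is a rough sketch using Python's standard `ipaddress` module; the helper names and the example ranges are illustrative assumptions, not values taken from the controller.

```python
import ipaddress

# The three private IPv4 blocks defined by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def in_rfc1918(cidr: str) -> bool:
    """True if the whole CIDR lies inside one RFC 1918 block."""
    net = ipaddress.ip_network(cidr)
    return any(net.subnet_of(block) for block in RFC1918_BLOCKS)


def pod_subnet_fits(pod_cidr: str, network_cidr: str) -> bool:
    """True if the pod subnet is fully contained in the cloud network."""
    pod = ipaddress.ip_network(pod_cidr)
    return pod.subnet_of(ipaddress.ip_network(network_cidr))


if __name__ == "__main__":
    # Illustrative values only.
    print(in_rfc1918("10.244.0.0/16"))
    print(pod_subnet_fits("10.244.0.0/16", "10.0.0.0/8"))
```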

ckotzbauer commented 4 years ago

This indicates that you did not initialize the cluster with:

[Service]
Environment="KUBELET_EXTRA_ARGS=--cloud-provider=external"

This is definitely set. As mentioned, I have been using the hcloud-controller for months now without problems.

Yes, there were invalid IPs in this example. Normally (before trying to update) I used these ranges (this always worked):

cluster-cidr: 10.32.0.0/12
service-network: 10.96.0.0/12
hcloud-network: 10.0.1.0/24

If I change the hcloud-network (with the above ranges) to 10.0.0.0/8 (as mentioned in the docs), Weave complains about overlapping networks. With the original "10.0.1.0/24", nodes are marked as "NodeNetworkUnavailable".
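
The overlap is easy to reproduce in plain CIDR terms. This is a small sketch with Python's `ipaddress` module, using the ranges quoted in this comment; it only shows which ranges overlap and makes no claim about how Weave itself performs its check. The function name is hypothetical.

```python
import ipaddress
from itertools import combinations


def overlapping_pairs(ranges):
    """Return the (name, name) pairs whose CIDR ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in ranges.items()}
    return [(a, b) for a, b in combinations(nets, 2) if nets[a].overlaps(nets[b])]


# Ranges as quoted in the thread.
narrow = {
    "cluster-cidr": "10.32.0.0/12",
    "service-network": "10.96.0.0/12",
    "hcloud-network": "10.0.1.0/24",
}
# Same setup, with the hcloud network widened to 10.0.0.0/8.
wide = {**narrow, "hcloud-network": "10.0.0.0/8"}

if __name__ == "__main__":
    print(overlapping_pairs(narrow))  # no pairs overlap
    print(overlapping_pairs(wide))    # both other ranges fall inside 10.0.0.0/8
```

With the /24 network none of the ranges overlap, but the cluster-cidr is also not contained in the hcloud network; with the /8 network it is contained, which is exactly what shows up as an overlap.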

LKaemmerling commented 4 years ago

Okay, I did a bit of research. It looks like Weave is not capable of using cloud controllers as a source for the routes/IPs. Could you try using Cilium? (We use Cilium everywhere too.)

ckotzbauer commented 4 years ago

Hm, it seems I found a fix (with Weave). It works when I add the podSubnet explicitly:

apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
networking:
  podSubnet: "10.244.0.0/16"

I thought this range was the default range for pods, but maybe I was wrong here. So I think we can close this issue.

Thanks @LKaemmerling for your time and research!