kubernetes-sigs / cluster-api

Home for Cluster API, a subproject of sig-cluster-lifecycle
https://cluster-api.sigs.k8s.io
Apache License 2.0

KCP reports "failed to update KubeadmControlPlane status" even though it does update the status #3021

Closed benmoss closed 4 years ago

benmoss commented 4 years ago

What steps did you take and what happened: Whenever KCP can't establish a connection to the workload cluster, it logs an error saying "Failed to update KubeadmControlPlane Status". This is misleading, and we don't treat it as a terminal error, because connecting to the workload cluster is only part of what updateStatus does. We still update the fields we can on the KCP object, and we still patch the object in etcd even if we can't connect to the workload cluster's etcd.

https://github.com/kubernetes-sigs/cluster-api/blob/master/controlplane/kubeadm/controllers/controller.go#L159-L162
https://github.com/kubernetes-sigs/cluster-api/blob/master/controlplane/kubeadm/controllers/status.go#L65-L68

What did you expect to happen: It should be an Info message, something like "Could not connect to workload cluster to fetch status".
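
To illustrate the suggestion, here is a minimal, self-contained sketch of one way the caller could tell "workload cluster unreachable" apart from a real status failure. This is not the actual KCP code: `status`, `readyCounter`, `errWorkloadUnreachable`, and `describe` are hypothetical names invented for this example, and the real `updateStatus` has a different signature.

```go
package main

import (
	"errors"
	"fmt"
)

// errWorkloadUnreachable is a hypothetical sentinel error marking the case
// where the workload cluster could not be contacted.
var errWorkloadUnreachable = errors.New("could not connect to workload cluster")

// status is a hypothetical, heavily reduced stand-in for the KCP status.
type status struct {
	Replicas      int  // known from the management cluster alone
	ReadyReplicas int  // requires talking to the workload cluster
	Ready         bool // requires talking to the workload cluster
}

// readyCounter stands in for the remote cluster client (assumption).
type readyCounter func() (int, error)

// updateStatus fills in every field it can. If the workload cluster is
// unreachable, the locally known fields are still set and the returned
// error wraps the sentinel so callers can recognize this case.
func updateStatus(s *status, ownedMachines int, countReady readyCounter) error {
	s.Replicas = ownedMachines

	ready, err := countReady()
	if err != nil {
		return fmt.Errorf("%w: %v", errWorkloadUnreachable, err)
	}
	s.ReadyReplicas = ready
	s.Ready = ready > 0
	return nil
}

// describe picks the log level and message the caller would emit.
func describe(err error) string {
	switch {
	case err == nil:
		return ""
	case errors.Is(err, errWorkloadUnreachable):
		return "INFO Could not connect to workload cluster to fetch status"
	default:
		return "ERROR Failed to update KubeadmControlPlane Status"
	}
}

func main() {
	s := &status{}
	// Simulate the EOF seen when the API server load balancer is not ready.
	err := updateStatus(s, 3, func() (int, error) { return 0, errors.New("EOF") })
	fmt.Println(describe(err))
	fmt.Printf("%+v\n", *s)
}
```

With something along these lines, the partial status (here `Replicas`) would still be patched, and the log would only say "Failed to update" when the update really failed.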

/kind bug
/area control-plane

[capi-kubeadm-control-plane-controller-manager-57cdf8f656-hkd7z manager] E0506 19:31:27.551405     887 controller.go:160] controllers/KubeadmControlPlane "msg"="Failed to update KubeadmControlPlane Status" "error"="failed to create remote cluster client: failed to create client for workload cluster default/ben: Get \"https://ben-apiserver-1949290032.us-east-1.elb.amazonaws.com:6443/api?timeout=30s\": EOF" "cluster"="ben" "kubeadmControlPlane"="ben-control-plane" "namespace"="default"
943 controller.go:258] controller-runtime/controller "msg"="Reconciler error" "error"="failed to create remote cluster client: failed to create client for workload cluster default/bmo: Get \"https://bmo-apiserver-580303059.us-east-1.elb.amazonaws.com:6443/api?timeout=30s\": EOF"  "controller"="kubeadmcontrolplane" "request"={"Namespace":"default","Name":"bmo-control-plane"}

vincepri commented 4 years ago

We should definitely try to fail more gracefully here

/milestone v0.3.x