The following logs show the error reported when kube-eleven can't remove a specific control-plane node from the k8s cluster:
ts4-c-1-cluster-test-set-no4-cejmofr time="09:01:02 CEST" level=info msg="Determine hostname..."
ts4-c-1-cluster-test-set-no4-cejmofr time="09:01:03 CEST" level=info msg="Determine operating system..."
ts4-c-1-cluster-test-set-no4-cejmofr time="09:01:04 CEST" level=info msg="Running host probes..."
ts4-c-1-cluster-test-set-no4-cejmofr time="09:01:06 CEST" level=info msg="Electing cluster leader..."
ts4-c-1-cluster-test-set-no4-cejmofr time="09:01:06 CEST" level=info msg="Elected leader \"gcp-kube-nodes-c09hagz-01\"..."
ts4-c-1-cluster-test-set-no4-cejmofr time="09:01:08 CEST" level=info msg="Building Kubernetes clientset..."
ts4-c-1-cluster-test-set-no4-cejmofr time="09:01:08 CEST" level=info msg="Running cluster probes..."
ts4-c-1-cluster-test-set-no4-cejmofr time="09:01:09 CEST" level=error msg="Host \"oci-kube-nodes-s6toh60-01\" is broken and needs to be manually removed\n"
ts4-c-1-cluster-test-set-no4-cejmofr time="09:01:09 CEST" level=warning msg="Hosts must be removed in a correct order to preserve the Etcd quorum."
ts4-c-1-cluster-test-set-no4-cejmofr time="09:01:09 CEST" level=warning msg="Loss of the Etcd quorum can cause loss of all data!!!"
ts4-c-1-cluster-test-set-no4-cejmofr time="09:01:09 CEST" level=warning msg="After removing the recommended hosts, run 'kubeone apply' before removing any other host."
ts4-c-1-cluster-test-set-no4-cejmofr time="09:01:09 CEST" level=warning msg="No other broken node can be removed without losing quorum."
ts4-c-1-cluster-test-set-no4-cejmofr Error: configuration broken hosts check
ts4-c-1-cluster-test-set-no4-cejmofr broken host(s) found, remove it manually
In some cases, kube-eleven eventually succeeds in removing the control-plane node after a couple of retries. In many others, it runs out of retries and fails.
Expected Behaviour
kube-eleven shouldn't fail in electing a leader or removing a control plane node when using a proxy.
Current Behaviour
ansibler restarts kube-apiserver and other static pods when updating no proxy envs (see https://github.com/berops/claudie/blob/master/services/ansibler/server/ansible-playbooks/update-noproxy-envs.yml#L10). In the next phase, kube-eleven can't elect a leader for a quorum (see https://github.com/berops/claudie/issues/1515) or remove a control-plane node because kube-apiserver pod is down on each control-plane node.
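One possible way to avoid racing against the restart is to wait until kube-apiserver answers again on every control-plane node before the kube-eleven phase starts. The snippet below is only a sketch of that idea and is not taken from the Claudie code base: the function name wait_for_apiserver, the timeout values, and the use of the /readyz endpoint on port 6443 are assumptions (the latter two are kube-apiserver defaults in many setups, but may differ in a given cluster).

```python
import time
import urllib.error
import urllib.request


def wait_for_apiserver(url, timeout_s=120.0, interval_s=2.0, context=None):
    """Poll `url` until it answers HTTP 200 or `timeout_s` elapses.

    Hypothetical helper: returns True once the endpoint reports ready,
    False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5, context=context) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused, TLS failure, or a non-2xx status: the
            # kube-apiserver static pod is probably still restarting.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A caller could run this once per control-plane node, for example with a URL of the form https://<node-address>:6443/readyz, and only continue when every call returns True. Against a real cluster the context argument would need an ssl.SSLContext that trusts the cluster CA, and anonymous access to /readyz has to be permitted; both are assumptions about the environment.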
Steps To Reproduce
kube-eleven