Closed · deitch closed this 6 months ago
I completed my tests. I did the following:

1. Edited /etc/kubernetes/kubelet.conf (the kubelet's kubeconfig file) to point to the node's local IP rather than the EIP.
2. Ran systemctl restart kubelet.
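The two steps above can be sketched as follows. The addresses are placeholders: 147.75.1.1 stands in for the EIP and 10.0.0.2 for the node's local IP, and the edit is shown against a throwaway copy in /tmp; on a real node you would edit /etc/kubernetes/kubelet.conf in place.

```shell
# Minimal stand-in for the kubelet's kubeconfig (hypothetical addresses).
cat > /tmp/kubelet.conf <<'EOF'
apiVersion: v1
kind: Config
clusters:
- cluster:
    server: https://147.75.1.1:6443
  name: kubernetes
EOF

# Repoint the server entry from the EIP to the node's local IP.
sed -i 's#https://147.75.1.1:6443#https://10.0.0.2:6443#' /tmp/kubelet.conf
grep server /tmp/kubelet.conf

# On the real node, follow the edit with:
#   systemctl restart kubelet
```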
I then tried to connect to the service default/kubernetes from various pods. Prior to these changes, this connection failed, which is why leader election failed and hence CPEM could not recover even though it had 3 copies running.
With the above change, the connection from pods to default/kubernetes succeeded.
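The in-pod connectivity check can be sketched as below. The service address 10.96.0.1:443 is an assumed default, not taken from this thread; inside a real pod these values come from the KUBERNETES_SERVICE_HOST/KUBERNETES_SERVICE_PORT environment variables that the kubelet injects, and the actual probe (commented out) needs a running cluster.

```shell
# Assumed in-cluster service IP and port for default/kubernetes.
KUBERNETES_SERVICE_HOST=10.96.0.1
KUBERNETES_SERVICE_PORT=443

# Show the endpoint the probe would hit.
echo "probing https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}/healthz"

# In a real pod, the probe itself would be:
#   curl -sk "https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}/healthz"
```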
But kubeadm does not support it for now. The right answer here is to follow up with kubeadm.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
As described in #304, which was resolved for most failure scenarios, there is one scenario not yet handled.
If the control plane node that has CPEM as the leader is also the node the EIP points to (when the EIP is managed via API calls by CPEM, not via BGP), and that node fails, then everything is stuck.
There are several potential solutions; I am looking for more or better thoughts on these.
The latter has to do with how kubeadm initializes nodes. I believe that if that changed (or kubeadm had an option to do so), it would resolve the problem. I am going to run tests and see if that is the case.