shekhar-rajak opened this issue 3 years ago
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Mark this issue as rotten with /lifecycle rotten
- Close this issue with /close

Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
Any plan for this feature?
We haven't dropped the idea. It would support useful features like cluster-wide upgrades and cert rotations.
As with other areas of kubeadm, someone has to have the time to work on it.
If someone starts work on it again I would like us to discuss trimming down the heavy boilerplate that kubebuilder adds.
I have a prototype and many ideas around this and the kubeadm library...
/lifecycle frozen
/cc
I will evaluate this feature and see if we can move this forward.
I did some initial implementation for upgrade and certs renewal: https://github.com/pacoxu/kubeadm-operator/issues/53.
I hit a problem implementing the kubeadm operator: how to restart kubelet? It is not suggested to run `systemctl restart kubelet` inside a pod for security concerns. If kubelet is not restarted, the apiserver will be the new version and kubelet will be the n-1 version.
As a workaround, I can start a daemon process on every node to check whether the kubelet versions of `/usr/bin/kubelet` and `/usr/bin/kubelet-new` are different. If they are different, it triggers `systemctl stop kubelet && /usr/bin/cp /usr/bin/kubelet-new /usr/bin/kubelet && systemctl restart kubelet`. This is not that convenient.
Some other thoughts: can we kill kubelet inside a pod? Or can we add a flag file in a hostPath to let kubelet know it should be restarted? I don't know if there is a simple way to restart kubelet by a kubelet API call or other methods.
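The check-and-swap workaround described above can be sketched as a small shell function. This is illustrative only: the function name and parameters are hypothetical, and on a real node the stop/restart arguments would be `systemctl stop kubelet` and `systemctl restart kubelet`.

```shell
#!/bin/sh
# Hypothetical sketch of the workaround: compare the installed kubelet with
# a staged kubelet-new; if they differ, stop the service, swap the binary,
# and restart. The stop/restart commands are parameters so the sketch can
# be exercised outside a real node.
reload_kubelet_if_changed() {
    kubelet_bin="$1"     # e.g. /usr/bin/kubelet
    kubelet_new="$2"     # e.g. /usr/bin/kubelet-new
    stop_cmd="$3"        # e.g. "systemctl stop kubelet"
    restart_cmd="$4"     # e.g. "systemctl restart kubelet"

    if [ -f "$kubelet_new" ] && ! cmp -s "$kubelet_bin" "$kubelet_new"; then
        # Binaries differ: stop kubelet, swap the binary, restart it.
        eval "$stop_cmd"
        cp "$kubelet_new" "$kubelet_bin"
        eval "$restart_cmd"
        echo "kubelet replaced and restarted"
    else
        echo "kubelet up to date"
    fi
}
```

On a real node this would run from a loop or a systemd timer, e.g. `reload_kubelet_if_changed /usr/bin/kubelet /usr/bin/kubelet-new 'systemctl stop kubelet' 'systemctl restart kubelet'`.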
@pacoxu I am also working on this; all the work on my side is still maintained internally. It's okay for you to join in, but we'd better sync with each other to avoid duplicated effort on this.
our work is based on @fabriziopandini's POC, how about yours?
If kubelet is not restarted, the apiserver will be the new version and kubelet will be n-1 version.
This is indeed an issue. So far it is fine, since kubelet 1.24 will continue to work with apiserver 1.25, but this must be taken care of because of the change in CRI support for Docker.
our work is based on @fabriziopandini's POC, how about yours?
The same (https://github.com/pacoxu/kubeadm-operator/issues/2). The POC was removed from the kubeadm code base in https://github.com/kubernetes/kubeadm/pull/2342.
cc @ruquanzhao for awareness; we need to figure out a solution for the upgrade of kubelet as well.
big +1 for multiple people collaborating on this. happy to help with more ideas / review when needed.
Some other thoughts, can we kill kubelet inside pod? Or can we add a flag file in hostPath to let kubelet know it should be restarted? I don't know if there is a simple way to restart kubelet by a kubelet API calling or other methods.
i don't think there is an API to restart kubelet. then again, that would restart a particular kubelet and we want to upgrade the binary too.
A workaround for me, I can start a daemon process on every node to check if kubelet version of /usr/bin/kubelet and /usr/bin/kubelet-new are different. If they are different, it will trigger to execute systemctl stop kubelet && /usr/bin/cp /usr/bin/kubelet-new /usr/bin/kubelet && systemctl restart kubelet. This is not that convenient.
this sounds like one of the ways to do it. i don't think i have any significantly better ideas right now. e.g. we could (somehow) deploy scripts on the hosts that manage the kubelet restart cycle and upgrade, even if the pods that the operator DS deploys are killed due to a kubelet restart.
how to restart kubelet? It is not suggested to run systemctl restart kubelet inside a pod for security concerns.
the operator will have to have super powers on the hosts, so it will be considered as a trusted "actor"...there is no other way to manage component upgrade (kubeadm, kubelet) and cert rotation, etc.
I can build the workaround into a script.
the operator will have to have super powers on the hosts, so it will be considered as a trusted "actor"...there is no other way to manage component upgrade (kubeadm, kubelet) and cert rotation, etc.
To restart kubelet inside a pod, I tried to mount `/run/systemd`, `/var/run/dbus/system_bus_socket`, and `/sys/fs/cgroup` into the agent and set `privileged`/`hostPID` to true, but `systemctl restart kubelet` still failed. I am not sure what I am missing.
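Since `/var/run/dbus/system_bus_socket` is already mounted, one avenue worth trying is to bypass `systemctl` and ask systemd to restart the unit over its D-Bus API directly. A hedged sketch follows; the helper name is made up, and it assumes `dbus-send` is available in the agent image and that the pod can actually reach the host's system bus.

```shell
#!/bin/sh
# Hypothetical alternative to invoking systemctl: call systemd's D-Bus API
# (org.freedesktop.systemd1.Manager.RestartUnit) through the mounted system
# bus socket. "replace" is the job mode that queues the restart, replacing
# any conflicting queued job. DBUS_SEND is overridable for dry-run testing.
restart_unit_via_dbus() {
    unit="$1"   # e.g. kubelet.service
    ${DBUS_SEND:-dbus-send} --system --print-reply \
        --dest=org.freedesktop.systemd1 \
        /org/freedesktop/systemd1 \
        org.freedesktop.systemd1.Manager.RestartUnit \
        "string:$unit" string:replace
}
```

Whether this succeeds where `systemctl` fails depends on the container's D-Bus permissions; it at least removes the dependency on `systemctl` talking to PID 1 via `/run/systemd`.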
I wrote a simple kubelet-reloader: it watches `/usr/bin/kubelet-new`, and once it differs from `/usr/bin/kubelet` it replaces the binary and restarts kubelet.
Currently kubeadm-operator v0.1.0 can support upgrades across versions, like v1.22 to v1.24. The operator handles `kubectl`/`kubelet`/`kubeadm` for the upgrade, placing the new kubelet at `/usr/bin/kubelet-new` for the kubelet reloader. See quick-start.
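The polling loop of such a reloader might look roughly like this. It is an illustrative sketch, not the actual kubelet-reloader code linked above: the function name, the bounded iteration count, and the parameters are inventions of this sketch.

```shell
#!/bin/sh
# Illustrative reloader loop: poll a staged kubelet-new binary and, when it
# differs from the installed kubelet, swap it in and restart the service.
# On a real node the loop would be infinite and restart would be
# "systemctl restart kubelet"; both are parameterized here for testing.
run_reloader() {
    bin="$1" new="$2" restart_cmd="$3" interval="$4" iterations="$5"
    i=0
    while [ "$i" -lt "$iterations" ]; do
        if [ -f "$new" ] && ! cmp -s "$bin" "$new"; then
            cp "$new" "$bin"       # swap in the staged binary
            eval "$restart_cmd"    # then restart kubelet
            echo "reloaded kubelet"
        fi
        i=$((i + 1))
        sleep "$interval"
    done
}
```

One design point worth noting: because the reloader runs as a host daemon rather than in a pod, it survives the kubelet restart it triggers, which is exactly the failure mode a pod-based agent cannot avoid.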
That's great. I think we should have our discussion on the k/kubeadm issue to have it in one place, and also cross-coordinate with @chendave to avoid duplicated work.
https://github.com/kubernetes/kubeadm/issues/2317 may be the right place.
FYI - We have all the initial scope of the KEP https://github.com/kubernetes/enhancements/pull/1239 implemented here: https://github.com/chendave/kubeadm-operator
But it is still just a POC.
@pacoxu @neolit123 @ruquanzhao
Earlier this year we asked whether kubeadm operator could become a SIG Cluster Lifecycle subproject. Some context can be found in the SIG Cluster Lifecycle weekly meeting notes (https://docs.google.com/document/d/1Gmc7LyCIL_148a9Tft7pdhdee0NBHdOfHS1SAF0duI4/edit#heading=h.xm2jvfwtcfuz) and in the cluster-api feedback-gathering issue (https://github.com/kubernetes-sigs/cluster-api/issues/7044).
Enhancement Description
https://github.com/kubernetes/enhancements/tree/master/keps/sig-cluster-lifecycle/kubeadm/2505-Kubeadm-operator
Summary
Kubeadm operator would like to enable declarative control of kubeadm workflows, automating the execution and the orchestration of such tasks across existing nodes in a cluster.
Motivation
The kubeadm binary can execute operations only on the machine where it is running; e.g. it is not possible to execute operations on other nodes, to copy files across nodes, etc.
As a consequence, most kubeadm workflows, like kubeadm upgrade, consist of a complex sequence of tasks that must be manually executed and orchestrated across all the existing nodes in the cluster.
Such a user experience is not ideal, due to the error-prone nature of humans running commands. The manual approach can be considered a blocker for implementing more complex workflows such as rotating certificate authorities, modifying the settings of an existing cluster, or any task that requires coordination across more than one Kubernetes node.
This KEP aims to address such problems by applying the operator pattern to kubeadm workflows.
- k/enhancements update PR(s):
- k/k update PR(s):
- k/website update PR(s):

Please keep this description up to date. This will help the Enhancement Team to track the evolution of the enhancement efficiently.