elotl / kip

Virtual-kubelet provider running pods in cloud instances
Apache License 2.0

Virtual-kubelet removes dangling pods #107

Closed: ldx closed this issue 3 years ago

ldx commented 4 years ago

I've seen this happen a few times: I kill the VK pod and the deployment starts a new one. Usually it comes up fine and the pods running via VK are kept intact. However, once in a while (usually when it takes a bit longer for VK to start up), VK can find the pods in the provider, but they are not there when it lists them in Kubernetes. It then treats them all as dangling pods and removes them from the provider.
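
To make the failure mode concrete, here is a rough Go sketch of what a dangling-pod cleanup pass looks like (the interface and function names are hypothetical, not Kip's actual code): if the Kubernetes listing is empty or stale right after a restart, every provider pod looks dangling and gets deleted.

```go
package sketch

import (
	"context"

	corev1 "k8s.io/api/core/v1"
)

// PodProvider is a hypothetical stand-in for the provider interface, used
// here only for illustration; the real virtual-kubelet/Kip interfaces differ.
type PodProvider interface {
	GetPods(ctx context.Context) ([]*corev1.Pod, error)
	DeletePod(ctx context.Context, pod *corev1.Pod) error
}

// cleanupDanglingPods deletes every provider pod that has no counterpart in
// the Kubernetes pod listing. If k8sPods is incomplete right after a VK
// restart, all provider pods are treated as dangling and removed.
func cleanupDanglingPods(ctx context.Context, k8sPods []*corev1.Pod, provider PodProvider) error {
	known := make(map[string]struct{}, len(k8sPods))
	for _, p := range k8sPods {
		known[p.Namespace+"/"+p.Name] = struct{}{}
	}
	providerPods, err := provider.GetPods(ctx)
	if err != nil {
		return err
	}
	for _, p := range providerPods {
		if _, ok := known[p.Namespace+"/"+p.Name]; !ok {
			if err := provider.DeletePod(ctx, p); err != nil {
				return err
			}
		}
	}
	return nil
}
```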

Obviously this is not desirable behavior and we should investigate why this might happen, and how to fix it or work around it.

ldx commented 4 years ago

Probably related to https://kubernetes.io/docs/concepts/architecture/nodes/#condition:

Node Condition: Ready. Description: True if the node is healthy and ready to accept pods, False if the node is not healthy and is not accepting pods, and Unknown if the node controller has not heard from the node in the last node-monitor-grace-period (default is 40 seconds).

So if Kip does not come up in 40s after a restart, the node Ready condition will change to "Unknown". That 40s seems to be in line with what I'm seeing when pods are removed after a restart.
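
As a quick way to verify this, here is a small client-go sketch (illustrative only, not Kip code) that reads the node's Ready condition and the age of its last heartbeat; once that age exceeds node-monitor-grace-period, the node controller flips Ready to Unknown:

```go
package sketch

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// readyConditionAge returns the node's Ready condition status and how long ago
// its last heartbeat was recorded. An age over node-monitor-grace-period
// (40s by default) means the node controller has marked the node Unknown.
func readyConditionAge(ctx context.Context, client kubernetes.Interface, nodeName string) (corev1.ConditionStatus, time.Duration, error) {
	node, err := client.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return "", 0, err
	}
	for _, cond := range node.Status.Conditions {
		if cond.Type == corev1.NodeReady {
			return cond.Status, time.Since(cond.LastHeartbeatTime.Time), nil
		}
	}
	return "", 0, fmt.Errorf("node %s has no Ready condition", nodeName)
}
```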

Also see: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/

--node-monitor-grace-period duration     Default: 40s
    Amount of time which we allow running Node to be unresponsive before marking it unhealthy. Must be N times more than kubelet's nodeStatusUpdateFrequency, where N means number of retries allowed for kubelet to post node status.

ldx commented 4 years ago

We need to do some research on node leases: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/0009-node-heartbeat.md

It seems to work around the problem by allowing nodes to hold a longer lease, ensuring that a restart of the provider won't evict pods prematurely.
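
A rough client-go sketch of the idea (illustrative only; the real lease handling lives in virtual-kubelet, and the 120s duration is just an example value): renew the node's Lease in the kube-node-lease namespace with a longer duration so the node controller keeps trusting the node across a provider restart.

```go
package sketch

import (
	"context"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// renewNodeLease refreshes the node's lease in the kube-node-lease namespace
// with a longer duration, so the node is not considered unresponsive while
// the provider is restarting.
func renewNodeLease(ctx context.Context, client kubernetes.Interface, nodeName string) error {
	leases := client.CoordinationV1().Leases("kube-node-lease")
	lease, err := leases.Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return err
	}
	holder := nodeName
	duration := int32(120) // example value, longer than the default heartbeat interval
	now := metav1.NewMicroTime(time.Now())
	lease.Spec.HolderIdentity = &holder
	lease.Spec.LeaseDurationSeconds = &duration
	lease.Spec.RenewTime = &now
	_, err = leases.Update(ctx, lease, metav1.UpdateOptions{})
	return err
}
```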

jrroman commented 4 years ago

After some research I have found that enabling node leases alone in VK does not bypass this issue: VK still sends node status updates, which causes the pods to be terminated prematurely.

There is an open WIP PR, https://github.com/virtual-kubelet/virtual-kubelet/pull/883/files, which will allow setting a flag to disable node status updates.

There is also another PR which will update the VK code base from the v1beta1 Lease API to v1. That pull request can be found here: https://github.com/virtual-kubelet/virtual-kubelet/pull/880/files
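
Conceptually, what such a flag would do is something like the sketch below (the flag and function names are made up here; see the linked PRs for the actual implementation): keep renewing the node lease, but skip posting node status.

```go
package sketch

import (
	"context"
	"time"
)

// runHeartbeats illustrates the idea behind a "disable node status updates"
// flag: lease renewals keep running as the heartbeat, while the node status
// update is skipped entirely when the flag is set.
func runHeartbeats(ctx context.Context, disableStatusUpdates bool, renewLease, updateStatus func(context.Context) error) {
	ticker := time.NewTicker(10 * time.Second) // example heartbeat interval
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			_ = renewLease(ctx) // always renew the node lease
			if !disableStatusUpdates {
				_ = updateStatus(ctx) // only post node status when not disabled
			}
		}
	}
}
```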

jrroman commented 3 years ago

This issue is currently being addressed by the https://github.com/elotl/kip-uptime service. As VK matures towards completion of the open node lease draft PRs, we can reevaluate whether node leases will solve this problem.

ldx commented 3 years ago

@jrroman I think we can close this one if you agree?

jrroman commented 3 years ago

@ldx I would agree with you, this is kind of stale now that kip-uptime exists. We can create an issue for testing node leases once they become more stable, to see if that eliminates our need for kip-uptime.

jrroman commented 3 years ago

Closing the issue; refer to the comment above.