kubernetes-sigs / descheduler

Descheduler for Kubernetes
https://sigs.k8s.io/descheduler
Apache License 2.0

Possible support for evicting pending pods that are stuck. #1183

Open kannon92 opened 1 year ago

kannon92 commented 1 year ago

I have a question about whether the descheduler would be a good place to add support for removing pending pods that are stuck.

I work on a batch project, and we see a lot of cases where pods get stuck due to configuration errors. I originally posted a GitHub issue on k/k hoping we could evict these pods in k/k. I was curious whether this could be in scope for the descheduler.

Some context for the reader

Generally, I am working on a KEP to represent pods that are stuck due to configuration issues, and I would also like to consider options for how to evict these pods. The main complication is that false conditions can be business as usual (BAU), so we were thinking we would want a timeout and only evict once the condition has matched a bad state for x amount of time.

For the descheduler, I just want to know whether this could be in scope as a feature request.

damemi commented 1 year ago

We have PodLifetime, which considers pending pods, but only if they have already been scheduled (see https://github.com/kubernetes-sigs/descheduler/issues/858 and https://github.com/kubernetes-sigs/descheduler/pull/846#discussion_r899217024).
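For reference, a minimal sketch of a PodLifeTime configuration scoped to Pending pods, assuming the v1alpha2 policy format of recent releases (the profile name and the one-hour threshold below are illustrative, not defaults):

```yaml
apiVersion: "descheduler/v1alpha2"
kind: "DeschedulerPolicy"
profiles:
  - name: pending-pod-lifetime   # illustrative profile name
    pluginConfig:
      - name: "PodLifeTime"
        args:
          # Consider pods older than one hour...
          maxPodLifeTimeSeconds: 3600
          # ...but only those in the Pending phase. Today this still only
          # applies to pods that have already been assigned to a node.
          states:
            - "Pending"
    plugins:
      deschedule:
        enabled:
          - "PodLifeTime"
```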

I think there have been other similar requests to evict non-scheduled pods, but I've held the opinion that it's not really de"scheduling" if the pod isn't scheduled to a node in the first place.

Our code right now basically only looks at pods that are already on a node, as far as I recall (@a7i @ingvagabund has this changed since your thread I linked above?). We could update that to consider all pods, at least for some strategies, which I think would be easier now that we have the descheduler framework in place.

I think there is still merit to the original proposals you linked, and it would be great if there were a standard condition the descheduler could rely on. The scheduler should also take some action to indicate that the pod has failed and remove it from the scheduling queue.

a7i commented 1 year ago

While Descheduler supports Pending pods, there are 2 things to consider:

ingvagabund commented 1 year ago

@kannon92 With the introduction of descheduling plugins, one can always create a custom plugin for any possible scenario, even including cases where a pod is not yet scheduled but is expected to be "evicted" (rather than descheduled). Among other reasons, we designed and created the descheduling framework to avoid having to decide whether a new scenario should be handled by the descheduler or whether a different component is preferable, so we can focus more on the mechanics rather than on (new) policies.

Quickly reading https://github.com/kubernetes/enhancements/pull/3816, all the mentioned configuration errors are exposed only after the kubelet tries to start a container (please prove me wrong). When it comes to evicting pods that are not yet scheduled (as mentioned in https://github.com/kubernetes/kubernetes/issues/113211#issuecomment-1599013555), we need to keep in mind that every component has its own responsibilities and owns part of the pod lifecycle: the scheduler is responsible for assigning a node to a pod, the kubelet for running a pod, and the descheduler for evicting a running pod.

As @a7i mentioned, we have the PodLifeTime strategy, which could be utilized for the case where a pod has been in e.g. a FailingToStart (or similar) state for some time. However, if a pod fails to start because of a configuration error, the corresponding pod spec needs to be updated, or a missing secret/configmap needs to be created. Evicting such a pod will not mitigate the cause of the configuration error; that is up to a different component (e.g. controllers). So ultimately the descheduler would only "clean up" all the broken pods. The descheduler is more interested in cases where the eviction itself resolves the underlying cause, e.g. moving a pod to a different node where networking is less broken, or where the node has more resources so the pod avoids getting OOM-killed.
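As an illustration, a rough sketch of such a "clean up broken pods" policy, assuming the deployed descheduler version accepts container waiting reasons (such as ImagePullBackOff or CreateContainerConfigError) in PodLifeTime's states; the profile name and the timeout are made up for this example:

```yaml
apiVersion: "descheduler/v1alpha2"
kind: "DeschedulerPolicy"
profiles:
  - name: clean-up-broken-pods   # hypothetical profile name
    pluginConfig:
      - name: "PodLifeTime"
        args:
          # Only evict pods that have been stuck for more than one hour.
          maxPodLifeTimeSeconds: 3600
          # Container waiting reasons that typically indicate a configuration
          # error; support for these values depends on the descheduler version.
          states:
            - "ImagePullBackOff"
            - "CreateContainerConfigError"
    plugins:
      deschedule:
        enabled:
          - "PodLifeTime"
```

Note that, as described above, the eviction only removes the broken pod; the owning controller will recreate it, and it will stay broken until the pod spec is fixed or the missing secret/configmap is created.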

kannon92 commented 8 months ago

/cc @alculquicondor

k8s-triage-robot commented 5 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 4 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue as fresh with `/remove-lifecycle rotten`
- Close this issue with `/close`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 3 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 3 months ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-sigs/descheduler/issues/1183#issuecomment-2106040918):

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues according to the following rules:
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
> - Reopen this issue with `/reopen`
> - Mark this issue as fresh with `/remove-lifecycle rotten`
> - Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)
>
> Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
> /close not-planned

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

ingvagabund commented 3 months ago

@kannon92 do you still plan to explore this feature?

kannon92 commented 1 month ago

I’m not sure if I’ll get to this. Can we keep it open? There is still interest in handling pending pods, and I know @alculquicondor was looking at this at one point as a workaround for some upstream issues around pods being stuck in Pending.