kubernetes-sigs / descheduler

Descheduler for Kubernetes
https://sigs.k8s.io/descheduler
Apache License 2.0

Merge TooManyRestarts and PodLifeTime #1169

Closed: a7i closed this issue 4 months ago

a7i commented 1 year ago

https://github.com/kubernetes-sigs/descheduler/pull/1165#issuecomment-1589662457

damemi commented 9 months ago

Didn't want to derail #1341 further, but let's also include RemoveFailedPods in the discussion here.

There is a lot of overlap between RemoveFailedPods and PodLifeTime when you use the states/reasons and maxPodLifeTimeSeconds args; at that point the two strategies are functionally equivalent.
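For example, a minimal sketch of that overlap, assuming the documented maxPodLifeTimeSeconds/states and minPodLifetimeSeconds/reasons args (NodeAffinity is just an illustrative value here):

# PodLifeTime: evict pods that have been in one of the listed states
# for longer than maxPodLifeTimeSeconds
podLifeTime:
  maxPodLifeTimeSeconds: 3600
  states:
  - NodeAffinity

# RemoveFailedPods: evict failed pods matching the listed reasons,
# once they are at least minPodLifetimeSeconds old
removeFailedPods:
  minPodLifetimeSeconds: 3600
  reasons:
  - NodeAffinity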

Additionally, it's not clear what is a "state" and what is a "reason", and it becomes more confusing when one strategy evicts based on container status, while another evicts based on pod status.

So I would propose that we merge these strategies together, but also clearly delineate the pod/container phases that are supported arguments. I think we should also rename the new strategy to reflect that this is eviction based on status (something like PodStatusPhase). So it would look like:

podStatusPhase:
  maxLifetimeSeconds: 60
  podStatuses:
  - Running
  - Error
  containerStatuses:
  - CreateContainerConfigError

By including maxLifetimeSeconds and Running, this covers the functionality of PodLifeTime. By including both pod statuses and container statuses, it covers the functionality of RemoveFailedPods. A maxRestarts argument would also cover TooManyRestarts.
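Concretely, the proposed sketch above could grow such an argument (a hypothetical shape, extending the example):

podStatusPhase:
  maxLifetimeSeconds: 60
  maxRestarts: 5        # hypothetical arg covering TooManyRestarts
  podStatuses:
  - Running
  - Error
  containerStatuses:
  - CreateContainerConfigError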

On the other hand, this starts to build a pattern where pretty much any strategy could be merged into one big strategy with different arguments; at that point we've just recreated the DeschedulerPolicy config. We want to avoid that, but we should still look at which other strategies could be merged. Maybe the Policy config could be more declarative about defining the type of pod to evict, rather than defining the strategy that looks for those pods.
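To make that last idea concrete, a purely hypothetical, pod-centric policy shape (not an existing API) might look like:

# hypothetical: describe the pods to evict, not the strategy that finds them
evictPodsMatching:
- podStatuses: [Error]
  olderThanSeconds: 3600
- containerStatuses: [CreateContainerConfigError]
  restartsMoreThan: 5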

When we do merge strategies, we should also alias the old strategy names for at least a couple of releases, with a warning to use the new strategy. This should be doable with the framework by having the old strategy wrap the new strategy's methods.

k8s-triage-robot commented 6 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 5 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue as fresh with `/remove-lifecycle rotten`
- Close this issue with `/close`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 4 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 4 months ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-sigs/descheduler/issues/1169#issuecomment-2174215377):

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues according to the following rules:
>
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
>
> - Reopen this issue with `/reopen`
> - Mark this issue as fresh with `/remove-lifecycle rotten`
> - Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)
>
> Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
> /close not-planned

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.