kubernetes-sigs / kueue

Kubernetes-native Job Queueing
https://kueue.sigs.k8s.io
Apache License 2.0
1.32k stars 230 forks

Optional garbage collection of finished Workloads #1618

Open woehrl01 opened 7 months ago

woehrl01 commented 7 months ago

What would you like to be added:

I would like a manager option to delete Workload resources as soon as the scheduled Job finishes, or after a TTL.

Why is this needed:

I changed a configuration to retain more history of job executions of a cronjob, and the memory consumption of the kueue-manager more than doubled:

[Image: Screenshot 2024-01-19 at 11.57.50]

Completion requirements:

This enhancement requires the following artifacts:

The artifacts should be linked in subsequent comments.

woehrl01 commented 7 months ago

Not sure if this is related, but I also found that after the change to keep history (and with no TTL on the Jobs), Kueue stopped working and showed an insane number of admitted workloads (using v0.5.2).

[Image: Screenshot 2024-01-19 at 12.16.28]

Deleting all the succeeded jobs by hand recovered that.

woehrl01 commented 7 months ago

I guess the admitted workload bug is fixed by #1654. It would still be nice to remove the Workload resource altogether.

k8s-triage-robot commented 4 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 3 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

alculquicondor commented 2 months ago

Are you asking to just delete the Workload but keep the parent Job (or Job CRD)?

woehrl01 commented 2 months ago

@alculquicondor yes, that was the idea. The Workload will be deleted eventually anyway, but this would free up etcd storage until the TTL of the Job CRD has been reached.
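To make the scope concrete, here is a rough sketch of the selection step: only Workloads with a true `Finished` condition are picked, and the parent Job is never touched. The `Condition` and `Workload` types and the `workloadsToDelete` function are simplified stand-ins invented for illustration, not Kueue's API types.

```go
package main

import "fmt"

// Condition and Workload are simplified, hypothetical stand-ins
// used only for this illustration.
type Condition struct {
	Type   string
	Status bool
}

type Workload struct {
	Name       string
	OwnerJob   string // name of the parent Job, which is left alone
	Conditions []Condition
}

// isFinished reports whether the Workload carries a true "Finished" condition.
func isFinished(w Workload) bool {
	for _, c := range w.Conditions {
		if c.Type == "Finished" && c.Status {
			return true
		}
	}
	return false
}

// workloadsToDelete returns the names of finished Workloads.
// It only names Workloads; parent Jobs are not part of the result.
func workloadsToDelete(ws []Workload) []string {
	var names []string
	for _, w := range ws {
		if isFinished(w) {
			names = append(names, w.Name)
		}
	}
	return names
}

func main() {
	ws := []Workload{
		{Name: "wl-a", OwnerJob: "job-a", Conditions: []Condition{{Type: "Finished", Status: true}}},
		{Name: "wl-b", OwnerJob: "job-b", Conditions: []Condition{{Type: "Admitted", Status: true}}},
	}
	fmt.Println(workloadsToDelete(ws)) // [wl-a]
}
```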

alculquicondor commented 2 months ago

/retitle Optional garbage collection of finished Workloads

🤔 maybe we can also do this for orphan Workloads #1789

mwysokin commented 2 months ago

/assign

k8s-triage-robot commented 1 month ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

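This bot triages issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage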

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 1 month ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-sigs/kueue/issues/1618#issuecomment-2263301502):

>The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
>This bot triages issues according to the following rules:
>- After 90d of inactivity, `lifecycle/stale` is applied
>- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
>- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
>You can:
>- Reopen this issue with `/reopen`
>- Mark this issue as fresh with `/remove-lifecycle rotten`
>- Offer to help out with [Issue Triage][1]
>
>Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
>/close not-planned
>
>[1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

kannon92 commented 1 month ago

/reopen

k8s-ci-robot commented 1 month ago

@kannon92: Reopened this issue.

In response to [this](https://github.com/kubernetes-sigs/kueue/issues/1618#issuecomment-2263856566):

>/reopen

kannon92 commented 1 month ago

Reopening because the work seems to be in flight.

k8s-triage-robot commented 1 week ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

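This bot triages issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage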

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 1 week ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-sigs/kueue/issues/1618#issuecomment-2323035097):

>The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
>This bot triages issues according to the following rules:
>- After 90d of inactivity, `lifecycle/stale` is applied
>- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
>- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
>You can:
>- Reopen this issue with `/reopen`
>- Mark this issue as fresh with `/remove-lifecycle rotten`
>- Offer to help out with [Issue Triage][1]
>
>Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
>/close not-planned
>
>[1]: https://www.kubernetes.dev/docs/guide/issue-triage/

woehrl01 commented 1 week ago

/reopen

k8s-ci-robot commented 1 week ago

@woehrl01: Reopened this issue.

In response to [this](https://github.com/kubernetes-sigs/kueue/issues/1618#issuecomment-2323038437):

>/reopen

woehrl01 commented 1 week ago

/remove-lifecycle rotten