project-codeflare / multi-cluster-app-dispatcher

Holistic job manager on Kubernetes
Apache License 2.0

Forceful deletion of remaining pods after cleanup #670

Open metalcycling opened 1 year ago

metalcycling commented 1 year ago

Issue link

599

What changes have been made

This version of the code lets users provide a termination wait time. MCAD first runs its usual cleanup, exactly as before, and then waits for that time. Any pods still present in the system once the wait expires are forcefully deleted.

Verification steps

Ran the code with pods wrapped in an AppWrapper that have a long terminationGracePeriodSeconds, then triggered preemption by deleting one of the pods so that minAvailable is violated. Before requeuing, this version forcefully deletes ALL pods that the regular cleanup left behind.

Checks

openshift-ci[bot] commented 1 year ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

Once this PR has been reviewed and has the lgtm label, please ask for approval from metalcycling. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

- **[OWNERS](https://github.com/project-codeflare/multi-cluster-app-dispatcher/blob/main/OWNERS)**

Approvers can indicate their approval by writing `/approve` in a comment.

Approvers can cancel approval by writing `/approve cancel` in a comment.