Create a new maintenance job that runs nightly and cleans up failed pods in each of our clusters. The code below might be useful; you can swap parallel for xargs if preferred:
kubectl get pods --field-selector="status.phase=Failed,spec" -A --no-headers | awk '{print $2 " -n " $1}' | parallel -j1 --will-cite kubectl delete pod "{= uq =}"
Discuss with wider team? This also links into this ticket
Background
Failed pods hog IPs and can prevent a node from being drained. Cleaning them up helps keep the cluster in a good state.
https://mojdt.slack.com/archives/C514ETYJX/p1724835696695569
Approach
Create a new maintenance job that runs nightly and cleans up failed pods in each of our clusters. The kubectl command at the top of this ticket might be useful; you can swap parallel for xargs if preferred.
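A rough sketch of the xargs variant, in case it helps whoever picks this up. The function names are made up for illustration, and `xargs -r` assumes GNU xargs. The field selector here uses only the `status.phase=Failed` clause, since the selector in the original command looks truncated after `,spec`; adjust it once the intended second clause is confirmed.

```shell
#!/usr/bin/env bash

# Hypothetical helper: turn "NAMESPACE NAME ..." rows (the layout of
# `kubectl get pods -A --no-headers`) into "NAME -n NAMESPACE" lines.
failed_pods_to_delete_args() {
  awk 'NF >= 2 {print $2 " -n " $1}'
}

# Hypothetical helper: delete every pod in phase Failed across all namespaces
# of the current kubectl context. The body runs in a subshell so that
# pipefail does not leak into the calling shell.
delete_failed_pods() (
  set -o pipefail
  kubectl get pods --field-selector=status.phase=Failed -A --no-headers \
    | failed_pods_to_delete_args \
    | xargs -r -L1 kubectl delete pod   # -r: do nothing when there is no input
)
```

Calling `delete_failed_pods` once per cluster context from the nightly job should be enough. `-L1` makes xargs run one `kubectl delete pod NAME -n NAMESPACE` per input line, which mirrors the `parallel -j1` behaviour.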
Definition of done