Clean out completed/failed/errored pods nightly

Discuss with wider team? This also links into this ticket

Background

Completed, failed and errored pods hog IPs and can prevent a node from being drained. Cleaning them up helps keep the cluster in a good state.

https://mojdt.slack.com/archives/C514ETYJX/p1724835696695569

Approach

Create a new maintenance job that runs nightly and cleans up each of our clusters. The command below might be useful; you can swap parallel for xargs if preferred:

kubectl get pods --field-selector="status.phase=Failed" -A --no-headers | awk '{print $2 " -n " $1}' | parallel -j1 --will-cite kubectl delete pod "{= uq =}"
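As a starting point for the job, here is a minimal sketch of the same idea without the dependency on parallel. It rests on a few assumptions that are not confirmed in this ticket: that "completed" pods are the ones in the Succeeded phase, that "errored" pods are reported in the Failed phase, and that the job's kubectl context already points at the cluster to clean. The function names `cleanup_phase` and `main` are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: delete pods in terminal phases, one at a time.
set -euo pipefail

# Delete every pod, in all namespaces, whose status.phase matches $1.
cleanup_phase() {
  local phase="$1"
  # With -A --no-headers the first two columns are NAMESPACE and NAME.
  kubectl get pods -A --no-headers --field-selector="status.phase=${phase}" \
    | awk '{print $1, $2}' \
    | while read -r namespace pod; do
        # --ignore-not-found: the pod may already have been removed
        # (for example by its owning Job) since it was listed.
        kubectl delete pod "$pod" -n "$namespace" --ignore-not-found </dev/null
      done
}

# Assumption: Failed covers failed/errored pods, Succeeded covers completed ones.
main() {
  local phase
  for phase in Failed Succeeded; do
    cleanup_phase "$phase"
  done
}
```

The nightly job's entrypoint would source this file and call `main` once per cluster. Deleting one pod at a time mirrors the `-j1` in the one-liner above and keeps the load on the API server low. Running `kubectl get pods -A --field-selector="status.phase=Failed"` by hand first is an easy way to check what would be removed.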

Definition of done

[ ] readme has been updated
[ ] user docs have been updated
[ ] another team member has reviewed
[ ] smoke tests are green
[ ] prepare demo for the team

Reference

How to write good user stories
