lwolf / kube-cleanup-operator

Kubernetes Operator to automatically delete completed Jobs and their Pods
MIT License
498 stars 109 forks

PR Submission #88

Open reddymh opened 2 years ago

reddymh commented 2 years ago

Hi Team,

I have made code changes for the tasks below.

1) Pod(s) stuck in the Terminating state that require a graceful delete, based on age/time.
2) Pod(s) in the Error/ContainerStatusUnknown/OOMKilled/Terminated/Completed state (a running pod sometimes changes to Completed due to node re-creation or preemptible nodes), based on age/time.

Can I raise a PR for this?

Thanks, Raj

reddymh commented 2 years ago

@lwolf Can I raise a PR for the above use case(s)?

lwolf commented 2 years ago

Hi, AFAIR pods stuck in weird states like Terminating/Unknown can't be deleted without force deletion, otherwise they wouldn't be "stuck". Using "force" usually hides the real issue, so I'd prefer not to have things that may leave the cluster in an inconsistent state.

Removing Completed pods sounds reasonable. I'm not really sure about the others, but if you've already done some coding, please share it and we can talk more about it.

reddymh commented 2 years ago

@lwolf We recently faced an issue where pods were stuck in Terminating status and some high-priority-class pods, such as the calico daemon set, were left in Pending state (pod limit per node). We then updated the cleanup operator to handle stuck Terminating pods by age with a graceful delete.

As for the other statuses such as Completed: some pods move to other nodes due to autoscaling up/down, but due to some issue the pods go into the Completed/Error state.

I will raise a PR for the second use case. The first use case would help with scheduler issues or pods that do not terminate properly, and we can put it behind a flag so it can be enabled when required.