uselagoon / remote-controller

A group of controllers for handling Lagoon builds and tasks in Kubernetes or Openshift

Terminate long running builds and tasks after x period #182

Closed shreddedbacon closed 2 years ago

shreddedbacon commented 2 years ago

In some cases a build or task can become stuck in a running state, for example if it hits an infinite loop in the code or if a process is waiting for user input.

We should implement a pruner that runs periodically and checks for build and task pods that are in a running state. If a pod has been running for longer than some interval x (to be configurable, with a default of something like 6 hours maybe?), then the build or task should be set to failed, and we should inject a failure message that says it was terminated due to timeout.

We already have numerous pruner-type jobs that run, so this should be fairly easy to build. Injecting a "terminated due to timeout" message might be the trickier part.

bomoko commented 2 years ago

@shreddedbacon - to be clear here, we want to inject the message into the build/task logs, right?