In some cases a build or task can become stuck in a running state, for example when the code hits an infinite loop or a process is waiting for user input.
We should implement a pruner that runs periodically and checks for build and task pods that are in a `running` state. If a pod has been running for longer than a configurable interval `x` (default perhaps 6 hours), the build or task should be set to `failed`, and we should inject a failure message saying it was terminated due to timeout.
We already have numerous pruner-type jobs that run, so this should be fairly easy to build. Injecting the "terminated due to timeout" message might be the trickier part.
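As a rough illustration of the check described above, here is a minimal sketch in Python. It is not based on the existing pruner code: `PodRecord`, `prune_stuck`, `DEFAULT_TIMEOUT`, and the field names are all hypothetical, and the 6 hour default is only the tentative value suggested in this ticket. It works on plain in-memory records and does not talk to the cluster.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumed default; the ticket only suggests "maybe 6 hours".
DEFAULT_TIMEOUT = timedelta(hours=6)


@dataclass
class PodRecord:
    """Hypothetical stand-in for a build or task pod record."""
    name: str
    kind: str              # "build" or "task"
    phase: str             # e.g. "running", "succeeded", "failed"
    started_at: datetime
    message: str = ""


def prune_stuck(pods, now, timeout=DEFAULT_TIMEOUT):
    """Mark running pods older than `timeout` as failed; return those pods."""
    timed_out = []
    for pod in pods:
        if pod.phase != "running":
            continue  # only running pods are candidates
        if now - pod.started_at > timeout:
            pod.phase = "failed"
            pod.message = f"{pod.kind} terminated due to timeout"
            timed_out.append(pod)
    return timed_out


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    pods = [
        PodRecord("build-1", "build", "running", now - timedelta(hours=7)),
        PodRecord("task-1", "task", "running", now - timedelta(hours=1)),
    ]
    for pod in prune_stuck(pods, now):
        print(pod.name, pod.phase, pod.message)
```

In a real pruner the pod list would come from the cluster and the failure message would need to be written back to the build or task record, which is the part flagged above as potentially tricky.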