Kubernetes doesn't set a runtime limit on cronjobs by default, which means that "stuck" cronjobs can run forever.
From an administrator's perspective, there is no point in pods sitting around doing nothing.
From a user's perspective, the stuck pods block any further executions of the cronjob, because the concurrency policy is Forbid. So it will appear that the cronjob has just stopped running:
Events:
  Type    Reason            Age                 From                Message
  ----    ------            ----                ----                -------
  Normal  JobAlreadyActive  45m (x28 over 28h)  cronjob-controller  Not starting job because prior execution is running and concurrency policy is Forbid
IMO Lagoon should set some reasonable time limit on cronjobs so that stuck pods don't sit around forever. This can be done by adding activeDeadlineSeconds to the Job template (docs).
What that limit should be is up for debate, but I'd say something like 2h-4h would be reasonable. This default limit would also need to be added to the Lagoon docs.
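
For illustration, here is a rough sketch of where activeDeadlineSeconds would sit in a CronJob manifest. This is not Lagoon's actual template: the name, schedule, image, and command are hypothetical placeholders, and 14400 (4h) is just the upper end of the range suggested above.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: example-cronjob            # hypothetical name
spec:
  schedule: "*/15 * * * *"         # hypothetical schedule
  concurrencyPolicy: Forbid        # matches the policy shown in the events above
  jobTemplate:
    spec:
      # Upper bound on how long each Job may stay active, in seconds.
      # 14400 = 4h; an illustrative value, not an agreed default.
      activeDeadlineSeconds: 14400
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: cronjob
              image: example/image:latest                  # hypothetical image
              command: ["/bin/sh", "-c", "echo running"]   # hypothetical command
```

Once a Job hits the deadline, Kubernetes terminates its running pods and marks the Job as failed with reason DeadlineExceeded, so the next scheduled execution is no longer blocked by the Forbid policy.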