uselagoon / build-deploy-tool

Tool to generate build resources

Lagoon cronjobs can hang forever #327

Open smlx opened 1 week ago

smlx commented 1 week ago

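Kubernetes doesn't set a runtime limit on the Jobs created by a CronJob by default, which means that a "stuck" cronjob can run forever. With a concurrency policy of Forbid, the stuck run also blocks every later scheduled run: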

Events:
  Type    Reason            Age                 From                Message
  ----    ------            ----                ----                -------
  Normal  JobAlreadyActive  45m (x28 over 28h)  cronjob-controller  Not starting job because prior execution is running and concurrency policy is Forbid

IMO Lagoon should set a reasonable time limit on cronjobs so that stuck pods don't sit around forever. This can be done by setting activeDeadlineSeconds in the CronJob's Job template, i.e. spec.jobTemplate.spec.activeDeadlineSeconds (see the Kubernetes Job docs).

What that limit should be is up for debate, but I'd suggest something in the 2h-4h range. Whatever default is chosen would also need to be documented in the Lagoon docs.
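
For illustration, here is a minimal sketch of what the proposal could look like on a generated CronJob. The name, schedule, image and command are hypothetical placeholders, not what build-deploy-tool currently generates, and 14400 seconds (4h) is just the upper end of the range suggested above, not an agreed default. Only the activeDeadlineSeconds field is the point here.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cronjob-example            # hypothetical name
spec:
  schedule: "*/15 * * * *"         # hypothetical schedule
  concurrencyPolicy: Forbid        # matches the policy shown in the events above
  jobTemplate:
    spec:
      # Proposed addition: hard cap on how long each Job may stay active.
      # 14400s = 4h is an assumed value taken from the 2h-4h suggestion.
      activeDeadlineSeconds: 14400
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: cronjob        # hypothetical container
              image: busybox:1.36
              command: ["sh", "-c", "echo running cron task"]
```

Once a Job has been active for longer than activeDeadlineSeconds, Kubernetes terminates its running pods and marks the Job as failed with reason DeadlineExceeded, so the next scheduled run is no longer blocked by the Forbid policy. The deadline applies to the whole Job, including any retries, so it would need to be comfortably longer than the slowest legitimate cronjob.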