gjcarneiro / yacron

A modern Cron replacement that is Docker-friendly
MIT License
454 stars, 38 forks

executionTimeout and killTimeout not working #70

Closed by lorenzomorandini 1 year ago

lorenzomorandini commented 1 year ago

Description

I'm trying to use executionTimeout and killTimeout to handle a process that could hang forever.

defaults:
  utc: false
  concurrencyPolicy: Forbid

jobs:
  - name: fetch_emails
    command: |
      sleep 999
    schedule: "* * * * *"
    captureStderr: true
    executionTimeout: 5
    killTimeout: 1

Expected: the fetch_emails job should be terminated so that a new one can start.

Actual:

INFO:yacron:Starting job fetch_emails
INFO:yacron:Job fetch_emails spawned
INFO:yacron:Job fetch_emails exceeded its executionTimeout of 5.0 seconds, cancelling it...
WARNING:yacron:Job fetch_emails: still running and concurrencyPolicy is Forbid

gjcarneiro commented 1 year ago

Your problem is essentially a common pitfall of shell scripts. The documentation says:

The command can be a string or a list of strings. If command is a string, yacron runs it through a shell, which is /bin/bash in the above example, but is /bin/sh by default.

If the command is a list of strings, the command is executed directly, without a shell.

The configuration

    command: |
      sleep 999

causes yacron to execute /bin/sh -c "sleep 999". That creates a tree of two processes:

  1. /bin/sh (parent)
  2. sleep 999 (child)

Then, when yacron tries to terminate the job after the timeout, it sends a signal to the parent process, /bin/sh. The shell terminates as expected, but its child, sleep 999, keeps running as an orphan.
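The orphaning can be reproduced outside yacron with a small sketch. It assumes a POSIX sh; the pidfile variable is a hypothetical scratch file used only to learn the child's PID. The child is started in the background and waited on so that its PID can be recorded, but the parent/child relationship is the same as in the configuration above.

```shell
#!/bin/sh
pidfile=$(mktemp)   # hypothetical scratch file holding the child's PID

# Parent shell: start a child, record its PID, then wait for it.
sh -c 'sleep 30 & echo $! > "$1"; wait' sh "$pidfile" &
parent=$!
sleep 1                              # give the parent time to spawn the child

child=$(cat "$pidfile")
kill -TERM "$parent"                 # signal only the parent, as a supervisor would
wait "$parent" 2>/dev/null || true   # reap the terminated parent shell

# kill -0 sends no signal; it only tests whether the process still exists.
if kill -0 "$child" 2>/dev/null; then
    result="child still running"
    kill "$child"                    # clean up the orphan
else
    result="child terminated"
fi
echo "$result"
rm -f "$pidfile"
```

Because the parent shell has no trap for SIGTERM, it simply dies and nothing forwards the signal to sleep.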

There are multiple ways to fix this issue.

Option 1: prefix the command with exec.

    command: |
      exec sleep 999

In this case, yacron runs /bin/sh -c "exec sleep 999". The exec keyword makes sleep replace the shell process instead of running as its child, so there is only a single process, and the signal yacron sends reaches sleep directly.
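A minimal sketch of the exec behaviour, assuming a POSIX sh (pidfile is a hypothetical scratch file): the shell writes its own PID before calling exec, and terminating that PID then ends sleep itself.

```shell
#!/bin/sh
pidfile=$(mktemp)   # hypothetical scratch file holding the shell's PID

# The shell records its own PID, then exec replaces it with sleep (same PID).
sh -c 'echo $$ > "$1"; exec sleep 30' sh "$pidfile" &
parent=$!
sleep 1                              # give the shell time to exec

recorded=$(cat "$pidfile")
if [ "$recorded" = "$parent" ]; then same_pid=yes; else same_pid=no; fi

kill -TERM "$parent"                 # this PID now belongs to sleep
wait "$parent" 2>/dev/null || true   # reap it

# kill -0 sends no signal; it only tests whether the process still exists.
if kill -0 "$parent" 2>/dev/null; then
    result="still running"
else
    result="terminated"
fi
echo "same_pid=$same_pid result=$result"
rm -f "$pidfile"
```

No second process is left behind, because there never was a separate child.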

Option 2: tell yacron not to run a shell at all, giving it a list of strings as command:

    command:
      - sleep
      - "999"

This causes yacron to run /usr/bin/sleep 999 directly, no shell involved.

Option 3: modify the shell code so that it handles SIGTERM itself:

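    command: |
      sleep 999 &                 # run the job in the background
      mysleep=$!                  # remember its PID
      trap "kill $mysleep" TERM   # on SIGTERM, forward the signal to the child
      wait $mysleep               # block until the child exits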
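The trap approach can be checked with a short sketch, assuming a POSIX sh. Two things here are additions for the demonstration and not part of the snippet above: the hypothetical pidfile scratch file, and a second wait that reaps the child after the signal has been forwarded, so that the check at the end is reliable.

```shell
#!/bin/sh
pidfile=$(mktemp)   # hypothetical scratch file holding the child's PID

sh -c '
  sleep 30 &
  mysleep=$!
  echo "$mysleep" > "$1"
  trap "kill $mysleep" TERM   # forward SIGTERM to the child
  wait $mysleep               # interrupted when SIGTERM arrives
  wait $mysleep               # added for the demo: reap the child after the trap ran
' sh "$pidfile" &
parent=$!
sleep 1                              # let the shell install its trap

child=$(cat "$pidfile")
kill -TERM "$parent"                 # signal only the parent shell
wait "$parent" 2>/dev/null || true   # returns once the shell has exited

# kill -0 sends no signal; it only tests whether the process still exists.
if kill -0 "$child" 2>/dev/null; then
    result="child still running"
    kill "$child"
else
    result="child terminated"
fi
echo "$result"
rm -f "$pidfile"
```

The trade-off compared with exec or a list command is extra shell code in every job, but it still works when the script has to do more than launch a single program.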