Closed — icy closed this issue 4 years ago
When I used dumb-init as a signal proxy for my script, everything worked well, so I'm fairly sure this is a problem/feature of supercronic.

In this configuration, supercronic sits in front of my script. When I press ^C, dumb-init sends the signal to supercronic, but supercronic gets stuck there. I have to pkill -9 supercronic to finish everything.
$ dumb-init --verbose -- ./testdump.sh
[dumb-init] Detached from controlling tty, but was not session leader.
[dumb-init] Child spawned with PID 426096.
[dumb-init] Unable to attach to controlling tty (errno=1 Operation not permitted).
[dumb-init] setsid complete.
INFO[2020-04-19T14:38:31+02:00] read crontab: ./supercronic.crontab
DEBU[2020-04-19T14:38:31+02:00] try parse(7): */5 * * * * * * ./cd.sh sys_sleep[0:15] = */5 * * * * * *
DEBU[2020-04-19T14:38:31+02:00] job will run next at 2020-04-19 14:38:35 +0200 CEST job.command="./cd.sh sys_sleep" job.position=0 job.schedule="*/5 * * * * * *"
INFO[2020-04-19T14:38:35+02:00] starting iteration=0 job.command="./cd.sh sys_sleep" job.position=0 job.schedule="*/5 * * * * * *"
INFO[2020-04-19T14:38:35+02:00] :: sys_sleep: Sleeping 180 second(s). Use 'now' to wake up the sleeping script. channel=stdout iteration=0 job.command="./cd.sh sys_sleep" job.position=0 job.schedule="*/5 * * * * * *"
^C[dumb-init] Received signal 2.
[dumb-init] Forwarded signal 2 to children.
INFO[2020-04-19T14:38:36+02:00] received interrupt, shutting down
INFO[2020-04-19T14:38:36+02:00] waiting for jobs to finish
WARN[2020-04-19T14:38:40+02:00] not starting: job is still running since 2020-04-19 14:38:35 +0200 CEST (5s elapsed) iteration=0 job.command="./cd.sh sys_sleep" job.position=0 job.schedule="*/5 * * * * * *"
WARN[2020-04-19T14:38:45+02:00] not starting: job is still running since 2020-04-19 14:38:35 +0200 CEST (10s elapsed) iteration=0 job.command="./cd.sh sys_sleep" job.position=0 job.schedule="*/5 * * * * * *"
./testdump.sh: line 2: 426098 Killed ./supercronic -debug ./supercronic.crontab
[dumb-init] Received signal 17.
[dumb-init] A child with PID 426096 exited with exit status 137.
[dumb-init] Forwarded signal 15 to children.
[dumb-init] Child exited with status 137. Goodbye.
This is a simple setup and it works well. I press ^C, dumb-init sends the correct signal to my script, and my script just exits correctly as expected.
$ dumb-init --verbose -- ./testdump.sh
[dumb-init] Detached from controlling tty, but was not session leader.
[dumb-init] Child spawned with PID 426230.
[dumb-init] Unable to attach to controlling tty (errno=1 Operation not permitted).
[dumb-init] setsid complete.
:: sys_sleep: Sleeping 180 second(s). Use 'now' to wake up the sleeping script.
^C[dumb-init] Received signal 2.
[dumb-init] Forwarded signal 2 to children.
:: rclean: Cleaning up metric file /home/gfg/metrics.txt
:: rclean: Cleaning up metric file /home/gfg/metrics.txt
[dumb-init] Received signal 17.
[dumb-init] A child with PID 426230 exited with exit status 130.
[dumb-init] Forwarded signal 15 to children.
[dumb-init] Child exited with status 130. Goodbye.
This is by design. When you interrupt Supercronic, it stops scheduling new instances of your jobs, but it doesn't interrupt the current ones. If you want to interrupt the jobs themselves, then you should send them a signal.
IIRC, when signalled, dumb-init forwards signals to its descendant process group, so the signal goes to Supercronic and to your child jobs, and those get terminated.
So, to summarize, it's not that Supercronic doesn't "like" to signal your script; it simply isn't what it does, because it's usually not a great default for a job runner to unconditionally kill everything when you ask it to stop scheduling new jobs. It's not a great default because you can simply kill everything yourself if that's what you want, which is what you did here.
As an aside, Supercronic doesn't create a new process group for your jobs, so if you signalled with kill -$PID (note the - here), that would signal the entire group, including your jobs.
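For illustration, the negative-PID variant can be demonstrated with plain sh. This is only a sketch, not Supercronic itself: `sleep 300` stands in for a long-running job, and `setsid` (from util-linux, assumed available) makes it a process-group leader so the negative PID has a group to target:

```shell
# Sketch: signalling a whole process group with a negative PID.
# 'sleep 300' stands in for a job; setsid makes it lead its own group.
setsid sleep 300 &
pid=$!
sleep 1                        # give setsid a moment to take effect
kill -TERM -- "-$pid"          # the leading '-' addresses the group
wait "$pid" && status=0 || status=$?
echo "terminated with status $status"
```

The exit status is 128 + 15 (SIGTERM), i.e. 143, matching the 137 (128 + SIGKILL) seen in the log above when pkill -9 was used instead.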
@krallin May I ask how I should support my script in a Docker container?
Let's say I have a Docker container that contains my script, which works well. Now I want to add supercronic in between, so that I can have a cron-like feature.
My understanding is that supercronic's idea is to support badly behaved scripts (e.g., a script that doesn't respect SIGTERM/SIGKILL), but my script / my job isn't that case.
Thanks so much
I think there's a bit of confusion here. It seems to me that you think Supercronic is a process manager. It isn't — it's a cron runner.
My understanding is that supercronic's idea is to support badly behaved scripts (e.g., a script that doesn't respect SIGTERM/SIGKILL), but my script / my job isn't that case.
No; your understanding is incorrect here.
Supercronic's design is that when it is signalled, it stops scheduling new jobs and waits for existing jobs to exit. Supercronic is not a process manager: it's a job runner. In that sense, it does expect that the jobs it is tasked with running will terminate at some point (and if they don't, then those are indeed misbehaved scripts from Supercronic's perspective).
This behavior is not about handling scripts that don't respect signals, it's simply the semantics of what termination means for Supercronic.
Let's say I have a docker container that contains my script, which is working well. Now I want to add supercronic in between, so that I can have a cron-liked feature.
You've provided very little detail about what your cron-like feature is, so it's hard to say with certainty.
That said, Supercronic is designed to run periodic jobs, so if your job is actually a daemon that doesn't exit and expects to be signalled to exit, then perhaps running said job through Supercronic isn't what you should be doing?
If you want to mix some periodic tasks along with a daemon process (I'm guessing this is what you need here?), then I would recommend using a process manager to start your app AND Supercronic as separate processes. Then, when your process manager is signalled, it'll signal your app to shut down, and also Supercronic, which will then wait for whichever tasks you have scheduled to finish running.
In general, for simplicity, I would strongly recommend just running your app in one container and your periodic tasks in another container that runs Supercronic. That being said, if that isn't an option for you (e.g. because they need to share some temporary files), then using a process manager is the way to go.
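That recommendation can be sketched as a tiny shell entrypoint. This is only an illustration, not anything Supercronic provides: the two `sleep 300` commands are stand-ins for your app and for a `supercronic my.crontab` invocation, and the backgrounded self-`kill` simulates `docker stop` so the sketch runs end to end:

```shell
# Entrypoint sketch: supervise two children, forward SIGTERM to both.
sleep 300 & app_pid=$!        # stand-in for your app
sleep 300 & cron_pid=$!       # stand-in for 'supercronic my.crontab'

shutdown() { kill -TERM "$app_pid" "$cron_pid" 2>/dev/null; }
trap shutdown TERM INT

( sleep 1; kill -TERM $$ ) &  # simulate 'docker stop' after one second

wait "$app_pid" "$cron_pid" || true   # interrupted when the trap fires...
wait "$app_pid" "$cron_pid" || true   # ...then reap both children
stopped=yes
echo "entrypoint: all children stopped"
```

With a real supercronic in place of the second stand-in, the forwarded SIGTERM would make it stop scheduling and wait for running jobs, per the semantics described above.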
I understand that supercronic is not a process manager.
What actually confused me is the description of the project itself (https://github.com/aptible/supercronic#why-supercronic):
They often don't respond gracefully to SIGINT / SIGTERM, and may leave running jobs orphaned when signaled. Again, this makes sense in a server environment where init will handle the orphan jobs and Cron isn't restarted often anyway, but it's inappropriate in a container environment as it'll result in jobs being forcefully terminated (i.e. SIGKILL'ed) when the container exits.
SIGTERM triggers a graceful shutdown (and so does SIGINT, which you can deliver via CTRL+C when used interactively)
So what exactly is a graceful shutdown here?
SIGTERM triggers a graceful shutdown (and so does SIGINT, which you can deliver via CTRL+C when used interactively)
I'm sorry if this was unclear — it means that Supercronic will stop starting new jobs and wait for existing jobs to finish. I.e., nothing gets killed.
I think triggering a graceful shutdown means something else (i.e., sending some signal to the jobs...), so yes, that confused me quite a lot.
Thanks a lot for your time, and I'm very sorry that I can't use the tool in my case. The tool is just so great.
I was similarly confused by the description of the project and also thought it would help facilitate passing signals down to the running jobs as well...
@icy were you able to find a good solution for this? I'm currently having issues getting my configuration set up correctly to handle a graceful shutdown of a PHP script which is being run by cron in a Debian-based Docker container...
@krallin is there a reason why supercronic can't pass these signals on to the jobs it's managing? Maybe this could be an option via a flag?
If TERM/INT signals need to be sent to the child processes as well as supercronic itself, do you have a suggestion of how that would be done in the context of Docker? Or, going further, in a Kubernetes cluster? I'm trying to test this locally by using docker stop, but eventually I would want this to work in Kubernetes when we roll out updates to the pods.
Hi @andersryanc ,
Thanks for asking. As said in my previous comment, signal processing is important for my tool and I can't use supercronic for that purpose. I use dumb-init and also rewrote the tool to act as a cron job, but it relies on an internal counter: when the counter reaches some limit, the program signals itself to exit.
It's interesting, but the "root" cause for us is to avoid a memory "leak"-like issue, as seen in https://github.com/golang/go/issues/20135 . I suggested an idea here https://github.com/kubernetes/kubernetes/issues/85752 but not many people have the same issue. Well, I have the option to use a k8s CronJob, or to rewrite the Go program/supporting script, but I decided to keep the Go code more friendly to me, and a k8s CronJob is time-based.
Hope that helps, and I wish you find your own way soon.
Hello everyone, I'm experiencing an issue that bears some similarity to the current discussion.
@krallin, my perspective is that the lack of signal propagation to child processes isn't optimal. Ideally, such behavior should be controlled by a configurable flag.
This becomes crucial when dealing with cron tasks that are time-intensive and may need intervention in the form of a SIGTERM signal.
To illustrate, in my specific scenario, I have a task set up to generate and email PDF invoices for a customer list. This task is quite lengthy, hence, when a single invoice is completed, and a SIGTERM signal is received, I'd prefer the task to terminate rather than commence with a new invoice. While I can allow for the completion of a single invoice, waiting for the entire process isn't feasible, especially during a scale-in phase. For instance, with AWS ECS, a task must be terminated within a maximum of 120 seconds!
Therefore, I believe it's essential to have a mechanism that allows sending signals to child processes.
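In the meantime, the job itself can implement that behavior. A sketch, assuming a hypothetical `process_invoice` function standing in for the real per-invoice work; the backgrounded self-`kill` simulates ECS delivering SIGTERM mid-batch:

```shell
# Sketch: a batch job that finishes the current item, then stops when
# SIGTERM arrives. 'process_invoice' is a placeholder for the real work.
stop=""
trap 'stop=1' TERM INT

process_invoice() {
  sleep 1                      # pretend each invoice takes a second
}

( sleep 2; kill -TERM $$ ) &   # simulate ECS sending SIGTERM mid-batch

last=0
for invoice in 1 2 3 4 5; do
  if [ -n "$stop" ]; then      # don't start a new invoice after SIGTERM
    break
  fi
  process_invoice "$invoice"
  last=$invoice
done
echo "stopped cleanly after invoice $last"
```

Because the trap only sets a flag, the invoice in flight always completes, but no new invoice is started afterwards: exactly the "finish the current one, skip the rest" behavior described above.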
Nevertheless, I appreciate the excellent work you've shared. Thank you!
I start the tool with supercronic -debug my.crontab. When I press ^C, I expect SIGTERM/SIGINT to be sent to my script (i.e., supercronic would work as a signal proxy). However, supercronic doesn't send any signal to my script; it just waits for my script.

My configuration

(./cd.sh sys_sleep works perfectly with SIGTERM/SIGINT when I tried it on its own, see below)

Debug logging

My cd.sh signal trap is working
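The actual cd.sh isn't included in the thread; a minimal sketch of the trap-plus-background-sleep pattern such a script presumably uses (the cleanup echo mimics the rclean lines in the log above, and the backgrounded self-`kill` stands in for pressing ^C):

```shell
# Sketch of a signal-friendly sleep job in the spirit of 'cd.sh sys_sleep'
# (the real script isn't shown). The trap fires promptly because the sleep
# runs in the background while the script blocks in 'wait'.
cleaned=""
cleanup() {
  echo ":: rclean: cleaning up"  # stand-in for the metric-file cleanup
  cleaned=1
}
trap cleanup TERM INT

( sleep 1; kill -TERM $$ ) &     # simulate pressing ^C after one second

echo ":: sys_sleep: sleeping..."
sleep 300 &
sleep_pid=$!
wait "$sleep_pid" || true        # returns early when the trap fires
kill "$sleep_pid" 2>/dev/null || true   # stop the leftover sleep
echo "woke up early (cleaned=$cleaned)"
```

Running the sleep in the foreground instead would delay the trap until the sleep finished, which is why this kind of script backgrounds it and waits.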