lholden / job_scheduler

A simple cron-like job scheduling library for Rust.
Apache License 2.0

When one job takes long, it causes other jobs to run multiple times afterwards (in same tick() call), violating their interval time #8

Closed Boscop closed 5 years ago

Boscop commented 6 years ago

When I have several jobs that run at fixed intervals and one job takes a long time, the same tick() call that runs this job also runs the jobs that became due in the meantime, and it runs them multiple times if they would have come due multiple times during that period. I'd prefer that each tick() call execute only one pending job, and that jobs that should have run in the past be debounced so that they run at most once, and only on the next tick() call.

It would also be more intuitive if each tick() call ran at most 1 job.

In my use case, one job regularly checks for a client update. When an update is available, it downloads it, extracts the executable from the zip, overwrites the current executable, and then sets a flag that the outer while loop uses to terminate. So after a successful update, the executable should break out of the loop (it is then restarted automatically because it's a systemd service with Restart=always). But because of the current behavior of tick(), after the update finishes, the same tick() call that ran the update checker job also runs all the other jobs that would normally have run during that time (the download can take quite long), without respecting the interval between those jobs! It just runs them in quick succession, which is not what I want. I want to be able to break out of the loop immediately after the update. It would work if each tick() call executed at most 1 pending job.
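To make the update scenario concrete, here is a rough sketch of an outer loop that checks an exit flag between jobs. It does not use the job_scheduler crate: the pending queue, the job names, and `demo_exit_after_update` are all hypothetical, and each loop iteration merely stands in for a tick() call that runs at most one pending job.

```rust
use std::cell::{Cell, RefCell};
use std::collections::VecDeque;

// Hypothetical stand-in for the real scheduler: a queue of jobs that are
// already due, processed one per loop iteration ("one job per tick").
fn demo_exit_after_update() -> Vec<&'static str> {
    let should_exit = Cell::new(false);
    let log = RefCell::new(Vec::new());

    // The update job is first; the others became due while it was running.
    let mut pending: VecDeque<Box<dyn FnMut() + '_>> = VecDeque::new();
    pending.push_back(Box::new(|| {
        log.borrow_mut().push("update");
        should_exit.set(true); // update installed, ask the outer loop to stop
    }));
    pending.push_back(Box::new(|| {
        log.borrow_mut().push("report");
    }));
    pending.push_back(Box::new(|| {
        log.borrow_mut().push("cleanup");
    }));

    // Outer loop: the flag is checked after every single job, so nothing
    // else runs once the update job has set it.
    while !should_exit.get() {
        match pending.pop_front() {
            Some(mut job) => job(),
            None => break,
        }
    }

    let ran = log.borrow().clone();
    ran
}
```

With this shape, only the update job runs before the loop exits. If one iteration drained the whole queue instead, the other two jobs would also run before the flag was ever checked.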

And the jobs that didn't get to run while another job was running (not just this update job, but in general) should be debounced (run only once each, over the next N tick() calls). The specified intervals for each job should not be shortened!
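A minimal sketch of the two behaviors being contrasted here, under some loud assumptions: `MiniScheduler`, `IntervalJob`, `tick_catch_up` and `tick_one_debounced` are made-up names, not the crate's API; jobs use a fixed interval instead of a cron schedule; and time is a plain millisecond counter passed in by the caller so the result is deterministic.

```rust
use std::cell::Cell;

// Hypothetical types for illustration only.
struct IntervalJob<'a> {
    interval_ms: u64,
    next_due_ms: u64,
    run: Box<dyn FnMut() + 'a>,
}

struct MiniScheduler<'a> {
    jobs: Vec<IntervalJob<'a>>,
}

impl<'a> MiniScheduler<'a> {
    fn new() -> Self {
        MiniScheduler { jobs: Vec::new() }
    }

    fn add(&mut self, interval_ms: u64, run: impl FnMut() + 'a) {
        self.jobs.push(IntervalJob {
            interval_ms,
            next_due_ms: interval_ms,
            run: Box::new(run),
        });
    }

    /// Catch-up: run every missed occurrence of every job in one call.
    fn tick_catch_up(&mut self, now_ms: u64) -> usize {
        let mut runs = 0;
        for job in &mut self.jobs {
            while job.next_due_ms <= now_ms {
                (job.run)();
                job.next_due_ms += job.interval_ms;
                runs += 1;
            }
        }
        runs
    }

    /// Debounced: run at most one overdue job (the most overdue one) and
    /// restart its waiting period from `now_ms`.
    fn tick_one_debounced(&mut self, now_ms: u64) -> bool {
        let due = self
            .jobs
            .iter_mut()
            .filter(|j| j.next_due_ms <= now_ms)
            .min_by_key(|j| j.next_due_ms);
        match due {
            Some(job) => {
                (job.run)();
                job.next_due_ms = now_ms + job.interval_ms;
                true
            }
            None => false,
        }
    }
}

// A 10 ms job whose scheduler was blocked until t = 35 ms.
fn demo_tick_modes() -> (u32, u32) {
    let catch_up_runs = Cell::new(0u32);
    let mut a = MiniScheduler::new();
    a.add(10, || catch_up_runs.set(catch_up_runs.get() + 1));
    a.tick_catch_up(35); // occurrences at 10, 20 and 30 all run now

    let debounced_runs = Cell::new(0u32);
    let mut b = MiniScheduler::new();
    b.add(10, || debounced_runs.set(debounced_runs.get() + 1));
    b.tick_one_debounced(35); // runs once; next due at 45
    b.tick_one_debounced(36); // nothing is due yet

    (catch_up_runs.get(), debounced_runs.get())
}
```

In this sketch the catch-up variant runs the job three times back to back, while the debounced variant runs it once and then waits a full interval again, so the interval is never shortened.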

lholden commented 6 years ago

Funny, I have use cases that have the inverse requirement. Occasionally a job will get backed up and I want the system to catch up.

It seems entirely reasonable and expected to be able to debounce requests though.

An interesting quirk with regard to debouncing, though: another user requested that jobs be able to run in the background. Backgrounding plus debouncing would require synchronizing job status and such.

Seems like when starting up the scheduler you should be able to configure features such as backgrounding and debouncing.

I'll take a look at implementing this in the near future. Likely this weekend or maybe while at RustConf.

Boscop commented 6 years ago

Thanks for the quick reply.

By background do you mean in other threads? I think it should not be done by default. My current use cases all share a RefCell without a Mutex, and I think the single-threaded model should stay the default because it's usually sufficient and the simplest to manage (and for backwards compatibility with code using this crate where the job closures aren't Send + 'static). E.g. in my cases, it has to remain true that only one job runs at a time and that they all run sequentially.

When you introduce multi-threading, it would force all closures to be Send + 'static, even for use cases that will be configured to run sequentially/single-threaded, requiring Arc<Mutex<SharedData>> :/ Maybe it would make sense to introduce a second ParallelJobScheduler which requires the closures to be Send + 'static, while the normal one doesn't have to impose those constraints on the closures.
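As a rough illustration of that constraint (plain standard library, not the crate's API; `run_sequential` and `run_parallel` are hypothetical names): closures that share state through Rc<RefCell<_>> are not Send, so they only work when jobs run sequentially on one thread, whereas anything handed to std::thread::spawn has to be Send + 'static, which pushes the shared state into Arc<Mutex<_>>.

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::{Arc, Mutex};
use std::thread;

// Single-threaded: job closures share state via Rc<RefCell<_>>.
// These closures are not Send, which is fine as long as they run
// one after another on the same thread.
fn run_sequential() -> u32 {
    let shared = Rc::new(RefCell::new(0u32));
    let mut jobs: Vec<Box<dyn FnMut()>> = Vec::new();
    for step in 1..=3u32 {
        let shared = Rc::clone(&shared);
        jobs.push(Box::new(move || {
            *shared.borrow_mut() += step;
        }));
    }
    for mut job in jobs {
        job();
    }
    let total = *shared.borrow();
    total
}

// Multi-threaded: thread::spawn requires Send + 'static closures, so the
// same shared counter has to become Arc<Mutex<_>>.
fn run_parallel() -> u32 {
    let shared = Arc::new(Mutex::new(0u32));
    let handles: Vec<_> = (1..=3u32)
        .map(|step| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                let mut guard = shared.lock().unwrap();
                *guard += step;
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *shared.lock().unwrap();
    total
}
```

Both functions add 1 + 2 + 3 to the shared counter. Passing the Rc-based closures to thread::spawn would be rejected at compile time, which is the backwards-compatibility concern for existing single-threaded users.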

The debouncing wouldn't be necessary for multi-threading because jobs never have to be delayed by other jobs (but it could be configurable if they should be re-entrant or run sequentially with other instances of the same job).

And the need to run at most 1 job per tick() call also doesn't apply to the multi-threaded version, because there wouldn't have to be a tick() call to advance the jobs.

So I think the multi-threaded version of the JobScheduler should be thought about separately from the needs / config options that only apply to the single-threaded version.

E.g. I would add the config options for "run at most 1 job per tick() call" and "run at most 1 instance of a delayed job, before starting the waiting period again" to the single-threaded JobScheduler first, and then afterwards think about designing a multi-threaded version :)