Rykian / clockwork

A scheduler process to replace cron.
MIT License

job skipping semantics too subtle and scary? #66

Open jjb opened 3 years ago

jjb commented 3 years ago

the readme says:

If another task is already running at the specified time, clockwork will skip execution of the task with the :at option. If this is a problem, please use the :thread option to prevent the long running task from blocking clockwork's scheduler.

Whoa! That's quite a significant thing to happen silently because another job is running. If threading solves it completely, that's great! But that's also not super clear.

Will one of these always not run?

every(1.week, 'myjob1', :at => 'Monday 16:20')
every(1.week, 'myjob2', :at => 'Monday 16:20')

Will both of these always run?

every(1.week, 'myjob1', :at => 'Monday 16:20', thread: true)
every(1.week, 'myjob2', :at => 'Monday 16:20', thread: true)

winterweird commented 2 years ago

I'll try to answer this to the best of my ability, since I recently butted my head against a similar issue, and maybe it will help someone reading this later :)

Will one of these always not run?

No, both of them may (and very likely will) run. To explain why, we have to take a look at how Clockwork actually schedules its jobs.

This is a very simplified Ruby-pseudocode explanation of how Clockwork's main loop works (according to my already simplified understanding):

def main_loop
  loop do
    t = Time.now
    tick(t)
    sleep(1)
  end
end

def tick(t)
  # NOTE: Assuming that all events have an 'at:' option -- otherwise, hour/minute is not checked
  events_to_run = events.select { |e| e.run_at.past? && e.hour == t.hour && e.min == t.min }
  # Run only the events selected for this tick, one after another
  events_to_run.each do |event|
    event.run
  end
end

The crucial thing to note here is that at every tick, multiple events may be scheduled to run. That is why two events scheduled to run at the same time (e.g. Monday 16:20) will probably (or maybe certainly) be scheduled to run at the same tick.

That doesn't mean that both events will start executing at 16:20. Clockwork is single-threaded by default[^1], which means that if the first job takes an hour, the second won't start executing until 17:20 -- but eventually it will run when the first job finishes.
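
To make the start times concrete, here is a minimal sketch of one tick in plain Ruby. It is a toy model, not Clockwork's code: `toy_tick`, `hhmm` and the hash keys are names I made up, and times are simply minutes since midnight. It only illustrates that both jobs are selected in the same tick and then run back to back.

```ruby
# Toy model of one scheduler tick (hypothetical names, not Clockwork's code).
# Times are minutes since midnight; each event is a plain hash.
def toy_tick(events, tick_time)
  due = events.select { |e| e[:at] == tick_time } # selection happens once per tick
  clock = tick_time
  due.map do |e|
    started = clock
    clock += e[:duration] # a non-threaded job blocks the loop while it runs
    [e[:name], started]
  end
end

# Formats minutes since midnight as HH:MM.
def hhmm(minutes)
  format('%02d:%02d', minutes / 60, minutes % 60)
end

events = [
  { name: 'myjob1', at: 16 * 60 + 20, duration: 60 },
  { name: 'myjob2', at: 16 * 60 + 20, duration: 1 }
]

toy_tick(events, 16 * 60 + 20).each do |name, started|
  puts "#{name} starts at #{hhmm(started)}"
end
# prints:
# myjob1 starts at 16:20
# myjob2 starts at 17:20
```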

So when is the :at a problem?

When a running job takes so long to finish that the next tick is delayed past the minute in which another job was supposed to be scheduled. Consider this example:

every(1.day, "long_job", at: '16:19') do
  sleep(300) # five minutes
end

every(1.day, "short_job", at: '16:20') do
  puts "I'm done"
end

Because these jobs are scheduled at different times, they won't be run at the same tick. Clockwork will happily take the first job, run for 5 minutes (blocking other executions and schedulings in the meantime) and then enter the next tick. Crucially, even though the "short_job" event was already supposed to be scheduled, it cannot be run because the time is now 16:24 (16:19 + 5 minutes). Remember the selection of events_to_run from before: e.run_at.past? && e.hour == t.hour && e.min == t.min. While e.run_at.past? == true and (e.hour == t.hour) == true, (e.min == t.min) == false! The event will not be scheduled to run.
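
The skip can be reproduced without waiting five minutes by simulating the loop with a fake clock. Again, this is only a sketch under my own assumptions: `simulate_blocking` and the hash keys are hypothetical, times are minutes since midnight, and a single equality check stands in for the hour/minute comparison.

```ruby
# Toy simulation of a blocking main loop (hypothetical, not Clockwork's code).
# Returns the names of the jobs that actually ran between `from` and `to`.
def simulate_blocking(events, from:, to:)
  ran = []
  now = from
  while now <= to
    due = events.select { |e| e[:at] == now } # stands in for the hour/minute check
    busy = 0
    due.each do |e|
      ran << e[:name]
      busy += e[:duration] # the job runs inline, so the loop waits for it
    end
    # The next tick happens once the jobs are done, or one step later if idle.
    now = [now + 1, now + busy].max
  end
  ran
end

events = [
  { name: 'long_job',  at: 16 * 60 + 19, duration: 5 },
  { name: 'short_job', at: 16 * 60 + 20, duration: 0 }
]

puts simulate_blocking(events, from: 16 * 60 + 15, to: 16 * 60 + 30).inspect
# prints ["long_job"]: the tick for 16:20 never happens, the next one is at 16:24
```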

How does :thread fix this? Simple: It prevents "long_job" from occupying the main loop, running in the "background" (separate thread) so that Clockwork is free to schedule more jobs at the correct time. You could use a separate job queuing system to achieve the same thing, which is probably preferable for production, but :thread is a nice simple way of doing it with just Clockwork and no dependencies.
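
The difference is easy to see with plain Ruby threads. This sketch does not use Clockwork at all; `blocked_for` is a hypothetical helper that measures how long the caller (think: the main loop) is held up when a job runs inline versus in a `Thread`.

```ruby
# Measures how long the caller is blocked while starting a job (hypothetical helper).
# With threaded: true the job runs in a background thread collected in `workers`.
def blocked_for(threaded:, workers:, job_seconds: 0.2)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  if threaded
    workers << Thread.new { sleep(job_seconds) } # returns immediately
  else
    sleep(job_seconds)                           # caller waits for the whole job
  end
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

workers = []
inline     = blocked_for(threaded: false, workers: workers)
background = blocked_for(threaded: true, workers: workers)
workers.each(&:join) # wait for background jobs before exiting

puts format('inline: loop blocked for %.2fs', inline)
puts format('thread: loop blocked for %.2fs', background)
```

The inline call holds the caller for roughly the job's duration, while the threaded call returns almost immediately, which is what leaves the loop free to reach the next tick on time.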

Will both of these always run?

In the spirit of the question, yes. There are some edge cases that are probably not relevant to you:

The first edge case: something in the first job goes so horribly wrong that it affects the main loop.

Even throwing an exception from a job is not enough to stop Clockwork's main loop, but maybe this could:

every(10.seconds, 'frequent job') { puts 'hi' }
every(30.seconds, 'idiotic idea', skip_first_run: true) { exit(0) }

Output:

➜  clockwork-experiment $ clockwork clock.rb
I, [2022-01-30T12:31:22.368420 #39905]  INFO -- : Starting clock for 2 events: [ frequent job idiotic idea ]
I, [2022-01-30T12:31:22.368684 #39905]  INFO -- : Triggering 'frequent job'
hi
I, [2022-01-30T12:31:22.368739 #39905]  INFO -- : Finished 'frequent job' duration_ms=0 error=nil
I, [2022-01-30T12:31:32.004024 #39905]  INFO -- : Triggering 'frequent job'
hi
I, [2022-01-30T12:31:32.004149 #39905]  INFO -- : Finished 'frequent job' duration_ms=0 error=nil
I, [2022-01-30T12:31:42.001709 #39905]  INFO -- : Triggering 'frequent job'
hi
I, [2022-01-30T12:31:42.001839 #39905]  INFO -- : Finished 'frequent job' duration_ms=0 error=nil
I, [2022-01-30T12:31:52.003444 #39905]  INFO -- : Triggering 'frequent job'
hi
I, [2022-01-30T12:31:52.003506 #39905]  INFO -- : Finished 'frequent job' duration_ms=0 error=nil
I, [2022-01-30T12:31:52.003524 #39905]  INFO -- : Triggering 'idiotic idea'
I, [2022-01-30T12:31:52.003556 #39905]  INFO -- : Finished 'idiotic idea' duration_ms=0 error=nil
➜  clockwork-experiment $

The other edge case: oops, you ran out of threads.

For a toy example with 2 jobs running weekly, using the default configuration, this is not gonna be a problem. However, you can conceive of a situation where it would be:

# example 1: changing the max amount of allowed threads
configure do |config|
    config[:max_threads] = 1
end
every(1.day, 'job1', at: '16:20', thread: true) { sleep(60) }
every(1.day, 'job2', at: '16:20', thread: true) { sleep(60) }

# example 2: many simultaneous jobs
# (by default, Clockwork allows 10 simultaneous threads -- the 11th won't run)
every(1.day, "job1", at: '16:20', thread: true) { sleep(1) }
every(1.day, "job2", at: '16:20', thread: true) { sleep(1) }
every(1.day, "job3", at: '16:20', thread: true) { sleep(1) }
every(1.day, "job4", at: '16:20', thread: true) { sleep(1) }
every(1.day, "job5", at: '16:20', thread: true) { sleep(1) }
every(1.day, "job6", at: '16:20', thread: true) { sleep(1) }
every(1.day, "job7", at: '16:20', thread: true) { sleep(1) }
every(1.day, "job8", at: '16:20', thread: true) { sleep(1) }
every(1.day, "job9", at: '16:20', thread: true) { sleep(1) }
every(1.day, "job10", at: '16:20', thread: true) { sleep(1) }
every(1.day, "job11", at: '16:20', thread: true) { sleep(1) }
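
The effect of a thread cap can be sketched in plain Ruby. This is a toy model based on my assumption that a threaded job is skipped (not queued) when the limit is reached; `run_with_cap` and its parameters are hypothetical and are not part of Clockwork.

```ruby
# Toy model of a thread cap (hypothetical, not Clockwork's code): a job is started
# in a thread only while fewer than max_threads are still alive, otherwise skipped.
def run_with_cap(job_names, max_threads:, job_seconds: 0.2)
  threads = []
  started = []
  skipped = []
  job_names.each do |name|
    if threads.count(&:alive?) < max_threads
      threads << Thread.new { sleep(job_seconds) }
      started << name
    else
      skipped << name # no free thread: this run is lost, not postponed
    end
  end
  threads.each(&:join)
  { started: started, skipped: skipped }
end

names = (1..11).map { |i| "job#{i}" }
result = run_with_cap(names, max_threads: 10)
puts "started: #{result[:started].size}, skipped: #{result[:skipped].join(', ')}"
# prints "started: 10, skipped: job11"
```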

[^1]: There may be a "well, technically" argument here (I can see some references to Thread in the source code), but to an outside observer Clockwork behaves as if it's single-threaded for all intents and purposes that matter to this explanation :)