resque / resque-scheduler

A light-weight job scheduling system built on top of Resque
MIT License

Delayed job processing loop uses wall time to calculate remaining sleep seconds #776

Open Verseth opened 1 year ago

Verseth commented 1 year ago

Looking through the codebase, I noticed that Time.now is used to calculate the remaining time to sleep between the iterations of the delayed job processing loop.

This could result in incorrect sleep durations, as Time.now reports wall-clock time, which can jump forward or backward at any point (for example, after an NTP correction or a manual clock change).

This is the code I'm talking about: https://github.com/resque/resque-scheduler/blob/master/lib/resque/scheduler.rb#L416

def poll_sleep_loop
  @sleeping = true
  if poll_sleep_amount > 0
    start = Time.now
    loop do
      elapsed_sleep = (Time.now - start)
      remaining_sleep = poll_sleep_amount - elapsed_sleep
      # etc (...)

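If the system's time were moved backward by one hour between the two calls to Time.now, elapsed_sleep would become negative and remaining_sleep would grow by about an hour, so the scheduler could sleep for roughly an hour longer than poll_sleep_amount. A forward jump has the opposite effect: remaining_sleep drops to zero or below and the sleep is cut short.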
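The arithmetic behind this scenario can be sketched in isolation. The helper name `remaining_sleep` and the concrete values below are hypothetical and chosen only for illustration; the method mirrors the two lines of the loop that compute elapsed_sleep and remaining_sleep, but takes explicit Time values instead of calling Time.now.

```ruby
# Hypothetical helper (not part of resque-scheduler): reproduces the
# elapsed/remaining calculation from poll_sleep_loop with explicit timestamps.
def remaining_sleep(poll_sleep_amount, start, now)
  elapsed_sleep = now - start          # Time - Time yields seconds as a Float
  poll_sleep_amount - elapsed_sleep
end

start = Time.at(1_700_000_000)         # wall-clock reading taken before the loop

# One real second passes, but the system clock is set back by one hour.
now_after_backward_jump = start + 1 - 3600
puts remaining_sleep(5.0, start, now_after_backward_jump) # 3604.0

# One real second passes, but the system clock is set forward by one hour.
now_after_forward_jump = start + 1 + 3600
puts remaining_sleep(5.0, start, now_after_forward_jump)  # -3596.0
```

With a 5 second poll interval, the backward jump leaves about an hour of sleep still to do, while the forward jump makes the loop think the interval has long since passed.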

One way to mitigate this issue would be to measure the elapsed time with a monotonic clock, which is not affected by jumps in the system time, like so:

def poll_sleep_loop
  @sleeping = true
  if poll_sleep_amount > 0
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    loop do
      elapsed_sleep = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - start)
      remaining_sleep = poll_sleep_amount - elapsed_sleep
      # etc (...)
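For anyone who wants to try the idea outside the scheduler, here is a minimal, self-contained sketch of a sleep loop driven by the monotonic clock. The names `monotonic_now` and `monotonic_sleep` and the `slice` parameter are assumptions made for this example; they are not part of resque-scheduler, and the signal handling that the real loop performs is left out.

```ruby
# Hypothetical helper: current reading of the monotonic clock in seconds.
def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Hypothetical helper: sleeps for at least `amount` seconds, recomputing the
# remaining time from the monotonic clock after every wake-up.
# `slice` (assumed parameter) caps the length of each individual sleep.
# Returns the elapsed time as measured by the monotonic clock.
def monotonic_sleep(amount, slice: 0.05)
  start = monotonic_now
  loop do
    remaining = amount - (monotonic_now - start)
    break if remaining <= 0

    sleep([remaining, slice].min)
  end
  monotonic_now - start
end

elapsed = monotonic_sleep(0.2)
puts format('slept for %.3f s', elapsed)
```

Because both readings come from the same monotonic source, changing the system time while the loop runs should not alter the computed remaining time.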