timgit / pg-boss

Queueing jobs in Postgres from Node.js like a boss
MIT License

Question: Best way to test scheduled jobs? #210

Closed by Ollie1700 3 years ago

Ollie1700 commented 3 years ago

Hi everyone, I'm hoping to get some advice on the best way to test scheduling with pg-boss.

First of all, developing with this library has been an absolute pleasure so far, so thank you for all of the hard work that has gone into this!

Our business use-case is scheduling certain calculations to happen periodically. Specifically, at the end of every day, every week, every month and every year.

This is the code I have so far that schedules each periodical queue:

  await boss.schedule(dailyQueue, '0 0 * * *', { dailyData: 'foobar' }, { /* pg-boss options */ })
  await boss.schedule(weeklyQueue, '0 0 * * 0', { weeklyData: 'foobar' }, { /* pg-boss options */ })
  await boss.schedule(monthlyQueue, '0 0 1 * *', { monthlyData: 'foobar' }, { /* pg-boss options */ })
  await boss.schedule(yearlyQueue, '0 0 1 1 *', { yearlyData: 'foobar' }, { /* pg-boss options */ })

  await boss.subscribe(dailyQueue, calculateDaily)
  await boss.subscribe(weeklyQueue, calculateWeekly)
  await boss.subscribe(monthlyQueue, calculateMonthly)
  await boss.subscribe(yearlyQueue, calculateYearly)

To test each of these, I'm using a 3rd party library called Sinon in order to spoof times in the JS Date object:

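const clock = sinon.useFakeTimers({
  // Start 5 seconds before midnight (local time) on New Year's Eve
  now: (new Date('2020-12-31T23:59:55')).getTime(),
  // Only replace Date; leave setTimeout/setInterval untouched
  toFake: [ 'Date' ],
  // Let the fake clock advance along with real time
  shouldAdvanceTime: true
})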

The call above sets the JS Date object to a fixed starting time, after which the clock advances normally. I set the time to 23:59:55 on 31 December to test the "end of year" scheduled job, but I'm seeing strange results, such as every job except the yearly one firing (even though 2020-12-31 isn't the end of the week).

My question is: is there a better/correct/recommended way to test scheduling behaviour? When testing with short time increments and without the time spoofing I am seeing expected results, so my best guess is that pg-boss doesn't like the fact that the date is being thrown off (potentially because it's using some other source for its time calculations?)

If anyone could point me in the right direction with this one I would greatly appreciate it. I trust that the pg-boss scheduling logic is correct, but it's some specifics of our business calculations that I need to be able to test on each daily/weekly/monthly/yearly schedule.

Thank you!

timgit commented 3 years ago

Hey there. Thanks for the compliments! pg-boss was built out of necessity for me and my team, and here we are, 4 years later and still using it. Distributed scheduling was recently added, and since I didn't want to reinvent the wheel, all cron parsing is handled by an independent package, cron-parser.

That may be reassuring as far as testing goes, but I can give you a bit more detail on how I'm using this package in pg-boss, all of which lives in timekeeper.js.

As a bit of background, there are several time-specific features that already exist, such as throttling, rate limiting and deferral. Deferral, for example, supports concepts such as "at least 5 minutes after right now". In order to support this, a database query needs to be able to determine if a job should be fetched, and it only has its own clock to know the answer. Because of this, you won't find references to any clocks from Node.js land. Additionally, to support throttling, I use a similar database technique, a unique constraint + timestamp rounding, in order to know if a job should be throttled or debounced.
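To make the "unique constraint + timestamp rounding" idea concrete, here is a rough sketch in plain JavaScript. It is not pg-boss's code: the function names are hypothetical, and a Set stands in for the database's unique constraint. Timestamps in the same window round down to the same slot, so a second insert in that window is rejected.

```javascript
// Hypothetical illustration only; pg-boss does this inside Postgres.

// Round a timestamp down to the start of its throttling window.
function throttleSlot(timestampMs, windowSeconds) {
  const windowMs = windowSeconds * 1000
  return Math.floor(timestampMs / windowMs) * windowMs
}

// A Set plays the role of a unique constraint on (queue, slot).
const seenSlots = new Set()

// Returns true if the job is accepted, false if it is throttled.
function tryInsert(queue, timestampMs, windowSeconds) {
  const key = `${queue}:${throttleSlot(timestampMs, windowSeconds)}`
  if (seenSlots.has(key)) return false
  seenSlots.add(key)
  return true
}

const base = Date.UTC(2021, 0, 1, 0, 0, 0)
const first = tryInsert('dailyQueue', base + 1000, 60)   // first job in the window
const second = tryInsert('dailyQueue', base + 59000, 60) // same 60s window
const third = tryInsert('dailyQueue', base + 60000, 60)  // next window
console.log(first, second, third)
```

The important point for testing is that, in the real system, the timestamp being rounded comes from the database clock, so faking Date in Node.js has no effect on it.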

The main idea is "when considering a distributed system, the database server's clock is truth". However, since cron-parser has no knowledge of this, I built a way to sync the clock in Node.js with the clock from the database on an interval to keep them as close to each other as possible.

Finally, given this background, cron-parser will evaluate an expression, limited to minute precision (not to the second), factoring in any clock skew that exists between the Node.js runtime and the database host.
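As a rough illustration of what "factoring in clock skew" at minute precision can mean, the sketch below estimates the database time from the local clock and truncates it to the minute. The helper names are hypothetical and this is not the code in timekeeper.js; it only shows the arithmetic.

```javascript
// Hypothetical helpers; an assumption about the general approach,
// not a copy of pg-boss internals.

// Skew: how far the local clock is ahead of the database clock, in ms.
function computeSkew(localNowMs, databaseNowMs) {
  return localNowMs - databaseNowMs
}

// Estimate the database time from the local clock and a previously
// measured skew, then truncate to the start of the minute.
function estimatedDatabaseMinute(localNowMs, skewMs) {
  const adjustedMs = localNowMs - skewMs
  return new Date(Math.floor(adjustedMs / 60000) * 60000)
}

// Example: the local clock is 10 seconds ahead of the database.
const databaseNow = Date.UTC(2021, 0, 1, 0, 0, 20)
const localNow = databaseNow + 10000
const skew = computeSkew(localNow, databaseNow)
const minute = estimatedDatabaseMinute(localNow, skew)
console.log(minute.toISOString())
```

If the skew was measured before Date was faked and is not refreshed afterwards, the adjusted time follows the fake clock, which would explain why the order of setup matters in a test.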

One more thing I'll mention is that cron in all systems, not just pg-boss, only works when the schedulers are actively running. On systems that run at second-level precision, such as crontab, they won't fire if the system was offline at "second 0". However, in pg-boss, there are 2 chances per minute when cron is evaluated, so you'd have to have all your "listening" cron instances down for an entire minute in order to miss a scheduled job publish.

Hopefully this adds more details to your test cases or clears up any missing information about how it works.

Ollie1700 commented 3 years ago

Thanks so much for such a comprehensive reply @timgit !

Based on your description I did some digging into timekeeper.js and noticed that it relies on Date.now() to determine whether an event should be sent. By being careful about the order in which things are set up, I've managed to get it working as expected for my tests! There are probably still issues with this, but I'd be interested to hear your thoughts on whether this actually works or whether I'm only seeing the expected results because of some other factor.

The actual "solution" is plain and simple and just involves the order in which actions occur (as well as "tricking" pg-boss into not paying attention to the out of sync clock by setting clockMonitorIntervalMinutes to a high number).

const initScheduledCalculations = async () => {
  console.log('Initialising...')

  const boss = new PgBoss({
    connectionString: '...',
    // Ignore clock monitoring for a while...
    clockMonitorIntervalMinutes: 10
  })

  // Setup all scheduled queues and subscribes
  ...
}

// Entry point
;(async () => {

  // Initialise everything before faking the clock time so pg-boss doesn't recognise the initial skew
  await initScheduledCalculations()

  // Setup the fake clock
  const clock = sinon.useFakeTimers({
    now: (new Date('2020-12-31 23:59:55')).getTime(),
    toFake: [ 'Date' ]
  })

  // Tick the clock and monitor the time
  setInterval(() => {
    clock.tick(1000)
    console.log((new Date()).toISOString())
  }, 1000)

})()

With the above code I was able to see my daily, monthly and yearly queues correctly fire at around Jan 1st 00:00 (within the 30 second check period).
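That result is what the four cron expressions imply: 1 January 2021 is a Friday, so the weekly schedule (day-of-week 0, Sunday) should not fire. A minimal sketch to check this without any clock faking is below. The matcher is hypothetical and deliberately tiny: it only understands `*` and single integers, evaluates in UTC, and is not a substitute for cron-parser.

```javascript
// Minimal matcher for 5-field cron expressions made of '*' or a single integer.
// No ranges, lists, steps or names; all times are treated as UTC.
function matchesCron(expression, date) {
  const [minute, hour, dayOfMonth, month, dayOfWeek] = expression.trim().split(/\s+/)
  const fieldMatches = (field, value) => field === '*' || Number(field) === value
  return (
    fieldMatches(minute, date.getUTCMinutes()) &&
    fieldMatches(hour, date.getUTCHours()) &&
    fieldMatches(dayOfMonth, date.getUTCDate()) &&
    fieldMatches(month, date.getUTCMonth() + 1) && // cron months are 1-12
    fieldMatches(dayOfWeek, date.getUTCDay())      // 0 = Sunday
  )
}

// The same expressions used in the schedule calls earlier in the thread.
const schedules = {
  daily: '0 0 * * *',
  weekly: '0 0 * * 0',
  monthly: '0 0 1 * *',
  yearly: '0 0 1 1 *'
}

const midnight = new Date(Date.UTC(2021, 0, 1, 0, 0, 0))
const fired = Object.keys(schedules).filter(name => matchesCron(schedules[name], midnight))
console.log(fired)
```

A check like this covers the "which schedules are due" question separately from the business calculations, which can then be tested by calling the handlers directly.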

As it stands, I think this is good enough for my testing purposes. Hopefully this helps someone in the future too. I'm also keen to hear any thoughts as to why this is a good/bad/improvable approach.

Thanks again @timgit 😃