timgit / pg-boss

Queueing jobs in Postgres from Node.js like a boss
MIT License

Failed jobs don't clear `delay.reject` timer and cause the timer to run for the default expiration duration of 15 mins #263

Closed aravindanve closed 3 years ago

aravindanve commented 3 years ago

I noticed that pg-boss was still causing my tests to hang, even after the fix in 6.1.0. On closer inspection, I found that when a job fails, its expiration timer is not cleared. This leaves long-running timers (15 minutes by default) piling up in memory. With a large number of failed jobs, this could crash the application. It also makes tests hang for at least 15 minutes if they exercise failure cases.

These lines appear to be causing the issue: https://github.com/timgit/pg-boss/blob/265d034243c6f2c9a536ecb417f0c7598d964d2e/src/manager.js#L174-L178

As you can see:

const result = await Promise.race([promise, reject]) // throws here if `promise` rejects

try {
  reject.clear() // never reached when the await above throws
} catch {}

I would suggest changing the lines to:

return Promise.race([promise, reject]).finally(() => {
  try {
    reject.clear()
  } catch {}
})
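
To make the difference concrete, here is a minimal standalone sketch of both shapes. It does not use pg-boss or the `delay` package: `rejectAfter`, `raceWithAwait`, `raceWithFinally` and `demo` are hypothetical names invented for illustration, and the 50 ms timeout stands in for the 15 minute default so the script finishes quickly.

```javascript
// Hypothetical stand-in for a `delay.reject`-style timer (an assumption, not
// the real implementation): rejects after `ms` unless clear() is called first.
function rejectAfter (ms) {
  let timer
  const promise = new Promise((resolve, reject) => {
    timer = setTimeout(() => reject(new Error('job expired')), ms)
  })
  promise.cleared = false
  promise.clear = () => {
    clearTimeout(timer)
    promise.cleared = true
  }
  return promise
}

// Same shape as the lines quoted from manager.js: the cleanup sits after the await.
async function raceWithAwait (work, expire) {
  const result = await Promise.race([work, expire]) // throws if `work` rejects
  expire.clear() // skipped whenever the await above throws
  return result
}

// Same shape as the suggested change: the cleanup runs on success and on failure.
function raceWithFinally (work, expire) {
  return Promise.race([work, expire]).finally(() => expire.clear())
}

async function demo () {
  const leaked = rejectAfter(50)
  await raceWithAwait(Promise.reject(new Error('job failed')), leaked).catch(() => {})

  const cleaned = rejectAfter(50)
  await raceWithFinally(Promise.reject(new Error('job failed')), cleaned).catch(() => {})

  const outcome = { leakedCleared: leaked.cleared, cleanedCleared: cleaned.cleared }
  leaked.clear() // tidy up so this script exits promptly
  return outcome
}

demo().then(console.log) // prints { leakedCleared: false, cleanedCleared: true }
```

In the first shape the timer survives a failed job and keeps the Node.js event loop alive until it fires, which is consistent with the hanging tests described above. In the second shape the timer is cleared regardless of how the race settles.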

Here is the script I used, if you want to see for yourself: https://github.com/aravindanve/pg-boss-shutdown-test/blob/main/index.js

timgit commented 3 years ago

Thanks for catching this. I've published a fix in beta if you would like to confirm it's working before I merge the linked PR.

aravindanve commented 3 years ago

@timgit I can confirm that 6.2.1-beta1 fixes the problem. Thanks!