timgit / pg-boss

Queueing jobs in Node.js using PostgreSQL like a boss
MIT License

Scheduled jobs cannot run more than once per minute #427

Open marklu opened 9 months ago

marklu commented 9 months ago

Regardless of what I set the clockMonitorIntervalSeconds or cronMonitorIntervalSeconds to, the scheduled jobs will not run more than once per minute.
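For reference, here's a minimal sketch of the kind of setup I'm testing with (connection string and queue name are placeholders):

```js
const PgBoss = require('pg-boss');

async function main() {
  const boss = new PgBoss({
    connectionString: 'postgres://user:pass@localhost/app', // placeholder
    // Tried various values for both of these; scheduled jobs still fire
    // at most once per minute.
    cronMonitorIntervalSeconds: 60,
    clockMonitorIntervalSeconds: 60
  });

  await boss.start();

  // '* * * * *' is the finest standard cron precision, i.e. once per minute.
  await boss.schedule('heartbeat', '* * * * *');
}

main().catch(console.error);
```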

There appears to be a hard-coded minute limiter in timekeeper.

While this may not be the most scalable approach, is there any other reason why this cannot or should not be overridden to check more often than once per minute?

timgit commented 9 months ago

The scheduling section in the docs explains that minute-level precision is the minimum cron precision. Does this answer your question?

marklu commented 9 months ago

I asked this question after reading the scheduling section of the docs.

Even though second-level precision is discouraged because of the number of times the database needs to be hit, does it make sense to allow an override of the checking precision to support second-level precision?

Would you be open to a pull request that makes it possible to override how often the schedule is checked and thus allow for second-level precision use cases?

timgit commented 9 months ago

What if we cache the schedules once every 30s and then evaluate them each second? It's a compromise that I think most would accept.
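Rough sketch of what I mean (loadSchedules, isDue, and sendToQueue are hypothetical helpers, not the current timekeeper internals):

```js
// Sketch only: loadSchedules(), isDue(), and sendToQueue() are hypothetical
// helpers, not the existing timekeeper internals.
let cachedSchedules = [];

// Refresh the schedule cache from the database every 30s.
setInterval(async () => {
  cachedSchedules = await loadSchedules();
}, 30 * 1000);

// Evaluate the cached schedules every second and send anything that's due.
setInterval(async () => {
  const now = new Date();
  for (const schedule of cachedSchedules) {
    if (isDue(schedule, now)) {
      await sendToQueue(schedule); // enqueue into the schedule's target queue
    }
  }
}, 1000);
```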

joshualyon commented 9 months ago

As a preface, we've been looking at pg-boss for queuing/scheduling and this was one of our concerns as well. When I saw this issue, the approach you mentioned is exactly what I was thinking: query periodically, hold in memory any items that would occur between now and the next query window, and execute them at their appropriate time.

I just wasn't intimately familiar with the codebase or whether this was reasonable. In other words, I wasn't sure if the system needed to maintain a 'lock' on each job/entry and if there were limits on how long those locks could be held before being ack'ed. From a quick look at the docs, I suppose this is the job moving from created to active and how long it's allowed to stay in that state before it transitions to expired, and it looks like the default is 15 minutes.
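(If I'm reading the docs right, that window can also be raised per job via the expiration options, something like:)

```js
// Example only: extend how long a fetched job may stay active before it
// transitions to expired (the default appears to be 15 minutes).
await boss.send('some-queue', { payload: true }, { expireInMinutes: 60 });
```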

timgit commented 9 months ago

The current implementation uses an internal queue with debouncing hard-coded to 60s. For per-second cron, this would need to be changed to 1s.

joshualyon commented 9 months ago

Isn't the internal queue responsible for calling the onCron() function, which grabs the scheduled items, filters to schedules that should have run in the last 60 seconds, and sends them to their actual destination queue for work?

If I understand correctly, onCron() currently filters the list of jobs using shouldSendIt() to determine if the schedule is within the time range to send (last 60 seconds) and then 'sends' those tasks.
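Conceptually, I'm picturing the current filter as something like this (a rough sketch using cron-parser, not the actual source):

```js
const parser = require('cron-parser');

// Rough sketch, not the actual source: a schedule "should send" if its most
// recent cron occurrence falls within the last 60 seconds.
function shouldSendIt(cron, now = new Date()) {
  const previous = parser.parseExpression(cron, { currentDate: now }).prev().toDate();
  return now - previous < 60 * 1000;
}
```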

I was under the impression that you wanted to limit the heavier database queries that retrieve the full list of scheduled items, so you wanted to keep at least a 60-second debounce in place... as such, I was thinking along the lines of changing the way the jobs are enqueued.

In other words, instead of onCron() filtering for schedules which should have already passed and enqueuing those with send(), it could filter to upcoming schedules and use sendAfter() so the enqueued jobs get scheduled with second-level precision.
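Something along these lines, as a sketch only (getSchedules() is a stand-in for however the schedule list is actually loaded, not a real API):

```js
const parser = require('cron-parser');

// Sketch of the proposed change: look ahead one polling window and enqueue
// upcoming occurrences with sendAfter() so they become available at the
// exact scheduled second. getSchedules() is a stand-in, not a real API.
async function onCron(boss, windowSeconds = 60) {
  const now = new Date();
  const windowEnd = new Date(now.getTime() + windowSeconds * 1000);

  for (const { name, cron, data, options } of await getSchedules()) {
    const next = parser.parseExpression(cron, { currentDate: now }).next().toDate();

    if (next <= windowEnd) {
      // sendAfter() accepts a Date, so the job isn't tied to the minute tick.
      await boss.sendAfter(name, data, options, next);
    }
  }
}
```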

timgit commented 9 months ago

The debounce I'm mentioning is there to ensure that two pg-boss instances running cron monitoring can't create more than one job per cron interval.
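For illustration only (not necessarily how the timekeeper implements it), a throttled send captures the constraint:

```js
// Illustration only, not necessarily how the timekeeper implements it:
// if two pg-boss instances both evaluate the same schedule for the same
// cron tick, a throttled send keeps the result to a single job.
async function sendForCronTick(boss, name, data) {
  await boss.send(name, data, {
    singletonSeconds: 60,         // at most one job per 60s window...
    singletonKey: `cron:${name}`  // ...per schedule (example key)
  });
}
```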