timgit / pg-boss

Queueing jobs in Postgres from Node.js like a boss
MIT License

"Singleton queue" does not prevent multiple jobs from becoming active #367

Closed by adamhamlin 1 year ago

adamhamlin commented 1 year ago

NOTE: I'm ignoring the retry state below for simplicity

singletonKey:

singletonQueue

I think the most common use case for singletonQueue is some kind of event-triggered sync/refresh job: It reads a snapshot of data from the db, then writes something based on it. There may be LOTS of events so you don't want to have LOTS of jobs queued up (because you only need 1 job that starts after the latest event).

But if the jobs from 2 events in close time-proximity become active at the same time, and the 2nd job writes first, now I have an incorrect result.
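To make the race concrete, here is a minimal sketch in plain JavaScript. The `db` object and `startSyncJob` helper are hypothetical stand-ins for a real database and job handler; nothing here is pg-boss code. It only replays the ordering described above: both jobs read, then the newer job writes before the older one.

```javascript
// Hypothetical in-memory stand-in for the database; not pg-boss code.
const db = { source: 1, derived: null };

// A sync job in two phases: read a snapshot, then later write a result based on it.
function startSyncJob(label) {
  const snapshot = db.source; // phase 1: read the snapshot
  return {
    label,
    finish() {
      db.derived = `derived-from-${snapshot}`; // phase 2: write the result
    },
  };
}

// Event 1 queues job A, which becomes active and reads source = 1.
const jobA = startSyncJob("A");

// Event 2 changes the data and queues job B, which also becomes active.
db.source = 2;
const jobB = startSyncJob("B");

// Job B happens to write first, then the older job A overwrites its result.
jobB.finish();
jobA.finish();

// db.derived is now based on source = 1 even though source is 2.
console.log(db);
```

If only one of the two jobs could be active at a time, job B would not read its snapshot until job A had finished writing, and the final result would reflect the latest event.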

I am not able to think of a scenario where you'd want the current behavior--that is, to limit to a single job queued but NOT limit to a single job active.

Would it be possible to amend the fetch query to specifically ignore singletonQueue jobs on the queue that have an active singletonQueue counterpart? That way they'll be passed over until the current active job completes.
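As a rough illustration of that fetch rule, here is an in-memory model in JavaScript. It is only a sketch of the idea, not the actual SQL: the `kind` field, the `SINGLETON_QUEUE` marker, and `fetchNextJob` are all hypothetical names invented for this example.

```javascript
// Hypothetical marker for jobs sent to a singleton queue; how pg-boss
// really stores this is not shown in this thread.
const SINGLETON_QUEUE = "singleton-queue";

// Return the next job to activate from `jobs` for queue `name`, or null.
// Created singleton-queue jobs are passed over while a counterpart is active.
function fetchNextJob(jobs, name) {
  const hasActiveSingleton = jobs.some(
    (j) => j.name === name && j.kind === SINGLETON_QUEUE && j.state === "active"
  );
  const next = jobs.find(
    (j) =>
      j.name === name &&
      j.state === "created" &&
      !(j.kind === SINGLETON_QUEUE && hasActiveSingleton)
  );
  if (!next) return null;
  next.state = "active";
  return next;
}

const jobs = [
  { id: 1, name: "sync", kind: SINGLETON_QUEUE, state: "active" },
  { id: 2, name: "sync", kind: SINGLETON_QUEUE, state: "created" },
  { id: 3, name: "sync", kind: "regular", state: "created" },
];

const first = fetchNextJob(jobs, "sync");  // job 3: regular jobs are unaffected
const second = fetchNextJob(jobs, "sync"); // null: job 2 waits for job 1
jobs[0].state = "completed";
const third = fetchNextJob(jobs, "sync");  // job 2 is eligible once job 1 is done
```

The skipped job is not failed or removed; it simply stays in the created state until a later fetch finds no active counterpart.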

phips28 commented 1 year ago

I am also interested in having this working:

Expected: Max 1 job in the created state, max 1 job in the active state

But I guess it's not working with distributed workers (there are multiple issues about this in general: concurrency per queue at a distributed level).

We would like to have some kind of trailing-debounced job-queue per singletonQueue. Maybe someone did that already with pg-boss.
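For reference, the "max 1 created, max 1 active" behaviour with trailing-debounce semantics could be modelled like this. This is a single-process, in-memory sketch; `TrailingDebounceQueue` and its methods are hypothetical names, not a pg-boss API, and it does nothing to solve the distributed-worker problem mentioned above.

```javascript
// Hypothetical in-memory model, not a pg-boss API: at most one pending
// ("created") job and at most one active job per queue.
class TrailingDebounceQueue {
  constructor() {
    this.pending = null; // latest payload waiting to run
    this.active = null;  // payload currently being processed
  }

  // Record an event. A later event replaces the waiting payload,
  // so only the most recent one runs (the "trailing" part).
  enqueue(payload) {
    this.pending = payload;
  }

  // Activate the pending job unless one is already active.
  fetch() {
    if (this.active !== null || this.pending === null) return null;
    this.active = this.pending;
    this.pending = null;
    return this.active;
  }

  // Mark the active job finished so the next fetch can proceed.
  complete() {
    this.active = null;
  }
}

const queue = new TrailingDebounceQueue();
queue.enqueue("event-1");
const run1 = queue.fetch();    // "event-1" becomes active
queue.enqueue("event-2");
queue.enqueue("event-3");      // replaces event-2 while event-1 is running
const blocked = queue.fetch(); // null: one job is already active
queue.complete();
const run2 = queue.fetch();    // "event-3": only the latest event runs
```

Replacing the pending payload is a design choice made for this sketch. Keeping the first pending job and rejecting later ones would work equally well when the job reads its data at start time rather than from the payload.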

adamhamlin commented 1 year ago

@timgit fyi I went ahead and opened https://github.com/timgit/pg-boss/pull/368.

I think it's a pretty focused change, and the requisite correlated subquery will only be run for singleton queue jobs and should be able to use the existing job_singletonkeyon index.

fhawkes commented 1 year ago

Wow I would also love to have this for one of our use cases! 🚀

bam4564 commented 1 year ago

"But if the jobs from 2 events in close time-proximity become active at the same time, and the 2nd job writes first, now I have an incorrect result."

@adamhamlin could you expound upon this a little bit? I'm personally looking at using singleton queues right now, and I'm not quite understanding the case in which a single job in the created state (i.e. sitting in some queue QueueName) can become two active jobs.

adamhamlin commented 1 year ago

@bam4564 I feel like the best/clearest explanation would be taking a look at the tests I added in fetchTest.js in the PR. Unlike regular singletonKey, singleton queue doesn't currently enforce anything about the active state, just queued state.
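The sequence can also be shown with a small in-memory model. This is a hypothetical sketch of the behaviour described in this thread, not the real tests or pg-boss internals; `createSingletonQueueModel` and its methods are made-up names.

```javascript
// Hypothetical model of the behaviour described in this thread: uniqueness is
// enforced only among queued ("created") jobs, never among active ones.
function createSingletonQueueModel() {
  const jobs = [];
  let nextId = 1;
  return {
    // Returns the new job id, or null when a job is already queued.
    send() {
      if (jobs.some((j) => j.state === "created")) return null;
      const job = { id: nextId++, state: "created" };
      jobs.push(job);
      return job.id;
    },
    // Activates the oldest queued job without checking for active jobs.
    fetch() {
      const job = jobs.find((j) => j.state === "created");
      if (!job) return null;
      job.state = "active";
      return job.id;
    },
    // Number of jobs currently in the active state.
    activeCount() {
      return jobs.filter((j) => j.state === "active").length;
    },
  };
}

const model = createSingletonQueueModel();
const idA = model.send();      // 1
const rejected = model.send(); // null: a job is already queued
model.fetch();                 // job 1 becomes active
const idB = model.send();      // 2: allowed, because nothing is queued now
model.fetch();                 // job 2 becomes active as well
const activeNow = model.activeCount(); // 2
```

So a single created job never splits into two active jobs. The second active job comes from a second send that is accepted as soon as the first job has left the created state.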

bam4564 commented 1 year ago

Thanks for that. After reviewing that + the docs related to this I think I understand now.

adamhamlin commented 1 year ago

Thanks, @timgit!