vsivsi / meteor-job-collection

A persistent and reactive job queue for Meteor, supporting distributed workers that can run anywhere.
https://atmospherejs.com/vsivsi/job-collection

Organizing different jobs including concurrency #258

Open a4xrbj1 opened 6 years ago

a4xrbj1 commented 6 years ago

Hi Vaughn,

I have the following requirement, hope you have an answer for this question:

A) We have several different services (meaning different web services that we call via their APIs), and for each service we have several jobs.

B) On top of that, we need a number of concurrent jobs for the same service (say, a maximum of 5 concurrent jobs). But these concurrent jobs should only be executed if they are for different users.

For requirement A), would you use a different job queue for each service, or how would you organize things so that jobs for the same service are all lined up (queued)? Something like jobQueueServiceA, jobQueueServiceB, etc.?

For requirement B), we can set the concurrency to 5 (jobs), but how can we at the same time ensure that two jobs for the same user aren't executed in parallel? Would it be best to chain them up, i.e. when we create a job we check whether there is already a job for that userId, and use the depends field to wait for the last job to finish?
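To make the chaining idea concrete, here is a minimal sketch of the per-user dependency check. It is plain Node-runnable JavaScript, not actual job-collection code: `pendingJobs` stands in for the result of querying the job collection for unfinished jobs, and the `{ _id, userId, created }` shape is an assumption for illustration.

```javascript
// Return the _id of the most recently created unfinished job for this
// user, or null if none exists (so the new job can run immediately).
function findLatestJobForUser(pendingJobs, userId) {
  const mine = pendingJobs.filter((j) => j.userId === userId);
  if (mine.length === 0) return null;
  // Assumes each job carries a numeric `created` timestamp; pick the newest.
  return mine.reduce((a, b) => (a.created > b.created ? a : b))._id;
}

// Decide the `depends` list for a new job: empty, or the single latest
// job for the same user, so the new job waits for it to finish.
function dependsFor(pendingJobs, userId) {
  const latest = findLatestJobForUser(pendingJobs, userId);
  return latest === null ? [] : [latest];
}
```

In real job-collection code, this logic would correspond to querying the collection for that user's unfinished jobs and passing the result to `job.depends([...])` before `job.save()`.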

As always, thanks in advance!

vsivsi commented 6 years ago

Hi, in general this is a synchronization problem. That is, you have multiple concurrent processes potentially competing for an exclusive resource (in this case, the "right" to access an API on behalf of a user).

In general, job-collection (and job schedulers as a class) do not supply synchronization primitives. This is because distributed synchronization is a hard problem, and every application has its own constraints.

But for simple cases, you can fake it by running only one queue and setting concurrency to 1. Then you know that only one job is running at a time.

Your constraint "B" above adds an additional wrinkle, but it can perhaps also be "faked" by setting the ~~cargo~~ payload parameter of your job queue to 5, and then writing a small amount of worker code to check for duplicate users among the 1-5 jobs received. If it detects duplicate user requests, it can either fail any jobs beyond the first (for a single user), or serialize them in the worker code so that they don't conflict.
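The worker-side check described above can be sketched as follows. This is a plain JavaScript sketch, not job-collection API code: the worker callback is assumed to receive an array of up to 5 jobs (because payload is set to 5), and the `userId` field on each job's data is an assumption.

```javascript
// Partition a batch of jobs so that at most one job per user runs now.
// Duplicates (a second job for a user already in the batch) are set
// aside to be failed for retry, or run sequentially afterwards.
function splitByUser(jobs) {
  const seen = new Set();
  const runNow = [];
  const duplicates = [];
  for (const job of jobs) {
    if (seen.has(job.userId)) {
      duplicates.push(job); // same user appears twice in this batch
    } else {
      seen.add(job.userId);
      runNow.push(job);
    }
  }
  return { runNow, duplicates };
}
```

In a real worker, the `runNow` jobs would call the external API concurrently, while each job in `duplicates` would either be failed (so job-collection retries it later) or awaited one at a time after its user's first job completes.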

I don't know enough about your requirements to make a hard recommendation here, but something like the above may help you avoid needing something more elaborate, like atomic updates to a distributed database for synchronization. If you do end up needing something like that, I have a locks package for MongoDB that works well (and is used heavily by my other Meteor package, file-collection): https://github.com/vsivsi/gridfs-locks

vsivsi commented 6 years ago

Corrected "cargo" to be "payload" in the above.