OptimalBits / bull

Premium Queue package for handling distributed jobs and messages in NodeJS.

Concurrency: multi server cluster #36

Closed: litepoint closed this issue 10 years ago

litepoint commented 10 years ago

Hi,

I have a setup with multiple worker servers, each running one worker-app instance per core (using pm2). The workers take jobs from a bull queue; there are two types of jobs.

Besides this, I have another application that creates the jobs.

However, when I restart a server, it starts working on a job that is already being worked on. Is this expected, or am I doing it wrong? :)

manast commented 10 years ago

No, it is not the expected behaviour. When a queue processes a job, it locks the job until processing is finished. The lock expires after a few seconds, so the worker needs to re-take it periodically. But if the worker blocks for longer than the lock lasts, another worker could start processing that same job. We have a unit test for this: https://github.com/OptimalBits/bull/blob/master/test/test_queue.js#L207-L241
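
To illustrate the behaviour described above, here is a minimal in-memory sketch of a lock with a time-to-live. It is not bull's implementation: the `JobLock` class, the worker names, and the 5000 ms lifetime are all assumptions chosen for illustration, and time is passed in explicitly so the example runs without Redis or real timers.

```javascript
// Hypothetical in-memory model of a job lock with a time-to-live (TTL).
// This is NOT bull's code; it only mimics the behaviour described above.
class JobLock {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;   // assumed lock lifetime in milliseconds
    this.owner = null;    // id of the worker currently holding the lock
    this.expiresAt = 0;   // timestamp at which the lock lapses
  }

  // Take the lock if it is free or expired, or renew it if we already own it.
  take(workerId, now) {
    const expired = now >= this.expiresAt;
    if (this.owner === null || expired || this.owner === workerId) {
      this.owner = workerId;
      this.expiresAt = now + this.ttlMs;
      return true;
    }
    return false;
  }
}

const lock = new JobLock(5000);
const results = [];

results.push(lock.take('worker-A', 0));    // A starts the job and takes the lock
results.push(lock.take('worker-B', 2000)); // B is refused: the lock is still valid
results.push(lock.take('worker-A', 2500)); // A renews; the lock now lasts until 7500
results.push(lock.take('worker-B', 6000)); // B is refused again
// Suppose A now blocks its event loop and cannot renew before 7500.
results.push(lock.take('worker-B', 9000)); // the lock has lapsed, so B takes the same job

console.log(results, 'current owner:', lock.owner);
```

In this model the last call succeeds only because worker A missed its renewal window, which is the situation that leads to two workers processing one job.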

litepoint commented 10 years ago

@manast thanks for the quick answer. I guess the worker could be blocking, as we have some very computationally heavy jobs.

Job example: generate a PDF report that includes 1000 plots and text. A single job can take up to an hour to complete.

How can I make sure that the worker retakes the lock? Would rewriting the code to be "less blocking" help, or not?

manast commented 10 years ago

You have to ensure that you never block your worker's event loop for more than 5 seconds, and everything should work correctly.
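
One common way to follow this advice in Node.js is to split a long computation into small units and yield to the event loop between them, so that timers (such as a lock renewal) get a chance to run. The sketch below assumes the report can be built plot by plot; `renderPlot` and `generateReport` are hypothetical stand-ins, not part of bull.

```javascript
// Resolve on the next event-loop iteration so pending timers and I/O can run.
const yieldToEventLoop = () => new Promise((resolve) => setImmediate(resolve));

// Hypothetical stand-in for one unit of heavy work, e.g. rendering a single plot.
function renderPlot(index) {
  let acc = 0;
  for (let i = 0; i < 1e5; i++) acc += Math.sqrt(i + index);
  return acc;
}

// Render plots one at a time, yielding after each so the loop is never
// blocked for longer than a single renderPlot call.
async function generateReport(plotCount) {
  const plots = [];
  for (let i = 0; i < plotCount; i++) {
    plots.push(renderPlot(i));
    await yieldToEventLoop();
  }
  return plots;
}

// Usage: a small report, so the example finishes quickly.
generateReport(5).then((plots) => {
  console.log(`rendered ${plots.length} plots`);
});
```

This only helps if each individual unit stays well under the lock duration. If a single unit can itself take several seconds, moving the work into a separate process or thread is probably the safer design.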

mikaelhm commented 10 years ago

@manast, thank you for the answer. That makes sense.

(I was logged in to the project account by mistake.)