nesquena / backburner

Simple and reliable beanstalkd job queue for ruby
http://nesquena.github.com/backburner
MIT License

Each thread in ThreadsOnFork worker should get its own connection #47

Closed (thcrock closed this issue 10 years ago)

thcrock commented 11 years ago

I've noticed a problem with the ThreadsOnFork worker once the job queue goes empty. If a thread is trying to reserve a job on an empty queue, it holds the connection's mutex, so no other communication can happen on that connection. (Note: the mutex code is in beaneater, though the mutex itself is not necessarily a bug, IMO.) This is problematic if another thread has just reserved a job and is trying to process it. Most importantly, before a job is actually processed, the 'stats-job' command is run to retrieve the ttr from beanstalkd; if that command is held up behind the blocking reserve, the already-reserved jobs cannot be processed, and you're deadlocked.
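To make the contention concrete, here is a minimal stdlib-only sketch. `SharedConnection`, `reserve`, `stats_job`, and `stats_delay` are hypothetical stand-ins invented for illustration; they are not beaneater's or backburner's actual API, and the sleep merely simulates a reserve blocking on an empty tube. The sketch assumes, as described above, that every command on a connection is serialized behind one mutex.

```ruby
# Hypothetical stand-in for a beanstalkd connection whose commands all
# go through a single mutex. This is NOT beaneater's real class.
class SharedConnection
  def initialize
    @mutex = Mutex.new
  end

  # Simulates a blocking reserve on an empty tube: the lock is held for
  # the whole wait. `started` (optional Queue) is signalled once the
  # lock has been acquired.
  def reserve(wait, started = nil)
    @mutex.synchronize do
      started&.push(true)
      sleep(wait)
      :timed_out
    end
  end

  # Simulates a quick command such as stats-job; it still needs the lock.
  def stats_job
    @mutex.synchronize { { "ttr" => 120 } }
  end
end

# Returns how long stats_job had to wait while another thread was
# blocked in reserve on the same connection.
def stats_delay(reserve_wait)
  conn = SharedConnection.new
  started = Queue.new
  reserver = Thread.new { conn.reserve(reserve_wait, started) }
  started.pop # wait until the reserving thread holds the lock

  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  conn.stats_job
  delay = Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0

  reserver.join
  delay
end

puts format("stats-job waited %.2fs behind the blocking reserve", stats_delay(0.3))
```

In this toy version the quick command is delayed by roughly the length of the simulated reserve; in the real worker the wait is bounded by whatever eventually ends the blocking reserve.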

The only way I've seen this deadlock broken is when beanstalkd eventually returns 'DEADLINE_SOON' to the blocking reserve commands, by default 120s later. That is pretty terrible latency, however.

You can alleviate this by defining more connection addresses in your backburner config; beaneater will use these as a pool, but you'll still get collisions where two threads try to use the same connection. Really, no two threads should ever share a connection/socket if a blocking reserve is involved.
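One way to picture "one connection per thread" is with thread-local storage, sketched below using only the Ruby standard library. `FakeConnection`, `thread_connection`, and `stats_delay_with_own_connections` are hypothetical names for illustration, not part of backburner or beaneater, and this is not necessarily how the eventual fix is implemented; in a real worker the factory would open an actual beanstalkd connection instead.

```ruby
# Hypothetical stand-in for a beanstalkd connection; each instance has
# its own lock, so separate instances never block one another.
class FakeConnection
  def initialize
    @mutex = Mutex.new
  end

  # Simulates a blocking reserve; signals `started` once the lock is held.
  def reserve(wait, started = nil)
    @mutex.synchronize do
      started&.push(true)
      sleep(wait)
      :timed_out
    end
  end

  # Simulates a quick command such as stats-job.
  def stats_job
    @mutex.synchronize { { "ttr" => 120 } }
  end
end

# Returns the calling thread's own connection, creating it on first use.
def thread_connection
  conn = Thread.current.thread_variable_get(:beanstalk_connection)
  return conn if conn

  conn = FakeConnection.new
  Thread.current.thread_variable_set(:beanstalk_connection, conn)
  conn
end

# Measures how long a stats_job takes in one thread while a different
# thread is blocked in reserve on its own, separate connection.
def stats_delay_with_own_connections(reserve_wait)
  started = Queue.new
  reserver = Thread.new { thread_connection.reserve(reserve_wait, started) }
  started.pop # the reserving thread now holds its own connection's lock

  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  Thread.new { thread_connection.stats_job }.value
  delay = Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0

  reserver.join
  delay
end

puts format("stats-job took %.3fs with per-thread connections",
            stats_delay_with_own_connections(0.3))
```

Because the blocked reserve and the stats-job call run on different connection objects, the quick command no longer waits for the reserve to finish. The trade-off is one socket per thread rather than a small shared pool.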

An easy way to reproduce this is to instantiate a ThreadsOnFork worker for a particular tube with a maximum of n (>= 2) threads, and then queue m (n < m < 2n) jobs onto that tube. The first n jobs should process immediately, and the remaining ones should hang in a state where they have technically been reserved but can't communicate with the beanstalkd server, so they won't process for 120 seconds (unless you've overridden ttr). In my particular case, I had 10 threads and 15 jobs.

nesquena commented 11 years ago

Yep, thanks for bringing this up and also tinkering on a fix. Look forward to pulling that in to make this worker more reliable and efficient.