drfeelngood / resque-batched-job

Resque plugin that understands individual jobs can belong to something bigger than themselves
https://rubygems.org/gems/resque-batched-job
MIT License

Jobs stay in the batch even after they complete #22

Closed by jsierles 9 years ago

jsierles commented 10 years ago

Quite often I'll find one or two jobs left in the batch which have completed successfully. I have to recreate them so they will finish and fire the after_batch callback.

Is there any reason this could happen? How can I debug further? Thanks!

drfeelngood commented 10 years ago

Hmmm, I've been out of the Resque paradigm for a while now, so it might take me a second to get the ol' gears turning. Had these jobs previously failed and then been re-enqueued?

jsierles commented 10 years ago

These jobs hadn't failed. They appeared to run successfully but still showed up in the batch. In any case, I thought I should mention it, but I have since switched to Sidekiq. Thanks!

pnomolos commented 10 years ago

I've also encountered this and tracked it down to an encoding issue. An & character was sometimes being encoded as \u0026, but only in the before_enqueue hook and not in the after_dequeue hook. This meant that the job wouldn't come off the queue when the plugin tried to remove it, because the encoded argument list was different. I figured out that this was due to JsonGem being used by MultiJson; when I changed to oj, everything seemed to work fine. I'm not quite sure why the two hooks encoded the character differently, but switching fixed it for me.
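
For anyone trying to picture the mismatch, here is a minimal sketch in plain Ruby. It is not the plugin's code: the `MyJob` class name and payload shape are made up for illustration, an Array stands in for the Redis structure that holds the batch, and the escaped string is written by hand rather than produced by a particular MultiJson adapter.

```ruby
require "json"

# Hypothetical payload; the class name and args are illustrative only.
payload = { "class" => "MyJob", "args" => ["a&b"] }

# Encoding as produced by Ruby's stdlib JSON: the ampersand is left as-is.
plain = JSON.generate(payload)

# Hand-written stand-in for an encoder that escapes "&".
# Single quotes keep the backslash-u sequence literal in the Ruby string.
escaped = '{"class":"MyJob","args":["a\u0026b"]}'

# Both strings decode to the same data...
same_data = JSON.parse(plain) == JSON.parse(escaped)

# ...but removal by value compares the raw strings, so it finds nothing.
batch   = [escaped]            # entry stored at enqueue time
removed = batch.delete(plain)  # entry rebuilt at removal time

puts "same data:    #{same_data}"
puts "same string:  #{plain == escaped}"
puts "removed:      #{removed.inspect}"
puts "left in batch: #{batch.size}"
```

The two encodings are equivalent JSON, yet the job stays in the batch because removal by value needs a byte-for-byte match.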

nonsensery commented 9 years ago

Ha, I just encountered this as well. The problem seems to be that remove_batched_job has to re-create the job string in order to remove it from the queue.

This is also problematic if a perform method mutates the args in any way, because it will prevent the job from being removed from the queue.
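
A small sketch of the mutation case, in plain Ruby. `ReportJob`, the `encode` helper, and the Array standing in for the batch are all hypothetical, chosen only to show why re-creating the job string after `perform` can miss.

```ruby
require "json"

# Hypothetical helper that builds the string stored in the batch.
def encode(klass, args)
  JSON.generate({ "class" => klass, "args" => args })
end

# Hypothetical job whose perform method mutates its argument in place.
class ReportJob
  def self.perform(options)
    options["started"] = true
  end
end

args  = [{ "id" => 7 }]
batch = [encode("ReportJob", args)]  # entry stored at enqueue time

ReportJob.perform(*args)             # args now also carries "started" => true

rebuilt = encode("ReportJob", args)  # string re-created after perform
removed = batch.delete(rebuilt)      # no match, so nothing is removed

puts "stored:  #{batch.first}"
puts "rebuilt: #{rebuilt}"
puts "removed: #{removed.inspect}"
```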

Instead, the batched code should hold onto the original source string and use that to remove the completed job from the queue, or at the very least, it should raise an exception if it was unable to remove the job from the queue.
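
Roughly what I have in mind, as a sketch only. The `Batch` class and `BatchRemovalError` are hypothetical names, an Array stands in for Redis, and how the original string would be carried through to the plugin's after-perform hook is left open here.

```ruby
require "json"

# Hypothetical error raised when a completed job cannot be found in its batch.
class BatchRemovalError < StandardError; end

# Hypothetical in-memory batch; an Array stands in for the Redis structure.
class Batch
  def initialize
    @entries = []
  end

  def size
    @entries.size
  end

  # Encode once at enqueue time and hand the exact string back to the caller.
  def add(klass, args)
    raw = JSON.generate({ "class" => klass, "args" => args })
    @entries << raw
    raw
  end

  # Remove by the original string; fail loudly if nothing matched.
  def remove(raw)
    removed = @entries.delete(raw)
    raise BatchRemovalError, "job not found in batch: #{raw}" if removed.nil?
    removed
  end
end

batch = Batch.new
args  = [{ "id" => 7 }]
raw   = batch.add("ReportJob", args)

args.first["started"] = true  # simulate perform mutating its arguments
batch.remove(raw)             # still matches, because the stored string is reused

puts "left in batch: #{batch.size}"
```

Since the stored string is never re-encoded, neither an encoder difference nor mutated args can cause a miss, and a miss that does happen surfaces as an exception instead of a batch that silently never finishes.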