contribsys / faktory

Language-agnostic persistent background job server
https://contribsys.com/faktory/

Clarification around Reservation Expired #409

Closed pbrisbin closed 2 years ago

pbrisbin commented 2 years ago

I had always assumed Jobs that error with Reservation Expired would be given to another consumer. This documentation seems to indicate that,

Yes, any jobs left over by a worker crash will cause Faktory to re-enqueue the job after the job reservation times out. This is treated identical to a FAIL.

As far as I can tell this isn't happening for us. We're seeing expired reservations end up in the Dead list, with retries remaining no less:

This implies they're not being re-enqueued, IMO, which seems like a bug. I'd expect one of two behaviors:

In no case would I expect to see it in Dead with retries remaining, which is what we are seeing in the above screenshot.

Am I misunderstanding something?


We're on Faktory Enterprise 1.5.5

pbrisbin commented 2 years ago

Oh wait, I think I figured it out. Let me know if this is right...

"Retry Count" is not retries remaining, it's retries originally configured. I've gotten this confused before :facepalm: I think I even asked to have a 1st-class "Retries Remaining" added to the Job data at some point.

"Next Retry" is frozen at whatever was the final next-retry value. You can tell because it matches "Enqueued". This could maybe be considered a bug, but meh.

So, this Job was originally enqueued 5 hours ago with 3 retries. It (probably) had reservation expired errors, was re-enqueued, respected retries, and did this 3 times in the span of an hour. Then, after all that, it moved to Dead -- all as desired.
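To make the distinction between configured retries and retries remaining concrete, here is a minimal sketch. It assumes the job payload carries a top-level `retry` field (the configured count) and a `failure` hash with a `retry_count` field, and that the server default is 25 retries; the helper name `retries_remaining` is made up for illustration and is not part of any Faktory client.

```python
from typing import Any, Dict


def retries_remaining(job: Dict[str, Any], default_retry: int = 25) -> int:
    """Estimate how many retries a job has left.

    Assumes `retry` is the originally configured retry count and
    `failure.retry_count` is the number of retries already consumed.
    """
    configured = job.get("retry", default_retry)
    failure = job.get("failure") or {}  # absent until the first failure
    used = failure.get("retry_count", 0)
    # Clamp at zero so exhausted (or retry-disabled) jobs never go negative.
    return max(configured - used, 0)


# A job configured with 3 retries that has used all of them.
dead_job = {"jid": "abc123", "retry": 3, "failure": {"retry_count": 3}}
print(retries_remaining(dead_job))  # prints 0
```

Under those assumptions, a job sitting in Dead with "Retry Count" 3 has zero retries left, which matches the explanation above.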

Sorry for needing the rubber :duck: issue here to figure that out.

mperham commented 2 years ago

Your retries_remaining value is in 1.6.

Lynguyen237 commented 1 year ago

@pbrisbin how can I get to the view in your screenshot? I don't seem to have that view at all. When I go to the dead tab, I can't click into each job that is dead to explore more. I am on Faktory Enterprise 1.6.0. Thanks!

pbrisbin commented 1 year ago

@Lynguyen237 for us, the Last Retry column is a link,

It goes to a URL that appears to be,

https://{base}/morgue/{timestamp}|{JID}

Kind of a difficult URL to construct by hand to view a Job. I can't tell you which timestamp it expects because the page only shows relative time values, without even a hover. We're still on an older version, Faktory Enterprise 1.5.5.
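If someone does want to build that URL by hand, a rough sketch follows. It only mirrors the `https://{base}/morgue/{timestamp}|{JID}` pattern above; the timestamp is treated as an opaque string because its expected format is unknown, percent-encoding the `|` separator is an assumption about what the server accepts, and `morgue_job_url` is a made-up helper name.

```python
from urllib.parse import quote


def morgue_job_url(base: str, timestamp: str, jid: str) -> str:
    """Build a Dead-job detail URL following the pattern seen in the Web UI.

    `timestamp` is passed through as-is; its required format is not known.
    """
    key = f"{timestamp}|{jid}"
    # safe="" percent-encodes every reserved character, including "|" and ":".
    return f"https://{base}/morgue/{quote(key, safe='')}"


# Placeholder values only; substitute your own host, timestamp, and JID.
print(morgue_job_url("faktory.example.com", "TS", "abc123"))
# prints https://faktory.example.com/morgue/TS%7Cabc123
```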