Closed Deekor closed 6 months ago
~I believe it doesn't go into retry, it should be directly re-enqueued.~
My mistake, it is handled as a job failure and should go into retries: https://github.com/contribsys/faktory/blob/b3e739a6c10164b3bdd3bf34dda9405964bd4137/manager/working.go#L223
Interesting. I had a long-running job today that definitely didn't.
The reservation time was 3 hours. The job sat in busy for 2 hours after the process died (an hour in) and never made it to retry.
I'll look into it next week. If you can give me a simple reproduction, that would help.
I was able to reproduce a simple crashing scenario. The jobs moved from Busy to Retries and then to Enqueued as expected.
You're right. Turns out the job had a short circuit in it (on retries) that I didn't notice. It finishes so fast that I didn't even see it in busy or the queue.
OK, in a similar vein: the worker that was running my CampaignStartWorker job crashed. The job retry short-circuited. This is all intended behavior, as discussed above.
However, the job still sits in the busy state, attributed to a ghost process. The job had a custom reservation and unique_for, which is blocking me from re-queueing the same job, even though the job isn't actually in the queue or running on a real process.
faktory_options reserve_for: 10800
faktory_options custom: { unique_for: 3.hours.to_i }
If the entire worker process crashes, you'll see the job sit until the reservation timeout passes. This is because Faktory can't tell if the job is still executing (and the network is bad) or if the process died and its pending jobs can be retried soon.
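To make the timing concrete, here is a toy model in plain Ruby of a reservation-based working set. It is only a sketch of the idea described above, not Faktory's actual implementation: `WorkingSet`, `Reservation`, and `reap` are hypothetical names, and timestamps are plain integers (seconds) to keep the example self-contained.

```ruby
# Toy model only: these classes are hypothetical and do not mirror Faktory internals.
Reservation = Struct.new(:jid, :reserved_at, :reserve_for) do
  # A reservation is expired once reserve_for seconds have elapsed.
  def expired?(now)
    now >= reserved_at + reserve_for
  end
end

class WorkingSet
  attr_reader :busy, :retries

  def initialize
    @busy = {}     # jid => Reservation, jobs currently held by a worker
    @retries = []  # jids waiting to be retried
  end

  # A worker fetched the job; the server only records when and for how long.
  def reserve(jid, now:, reserve_for:)
    @busy[jid] = Reservation.new(jid, now, reserve_for)
  end

  # The worker reported success, so the reservation is dropped.
  def ack(jid)
    @busy.delete(jid)
  end

  # Move every job whose reservation has expired from busy to retries.
  # A crashed worker never acks, so this is the only way its jobs leave busy.
  def reap(now)
    expired = @busy.values.select { |r| r.expired?(now) }
    expired.each do |r|
      @busy.delete(r.jid)
      @retries << r.jid
    end
    expired.map(&:jid)
  end
end

ws = WorkingSet.new
ws.reserve("abc", now: 0, reserve_for: 10_800) # 3-hour reservation
ws.reap(3_600)   # worker died an hour in: nothing is reaped, job stays in busy
ws.reap(10_800)  # reservation elapsed: job moves from busy to retries
```

In this model the server has no signal between the fetch and the ack, which is why a job with a 3-hour reservation stays in busy for the remaining 2 hours after a crash at the 1-hour mark.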
1.6.2
ruby
I've noticed this behavior: if the Faktory worker process crashes (out of memory, for example), any jobs that worker was processing sit in busy until the reservation timeout and then never enter the retry queue. Is this intended behavior? I'm worried about jobs being lost if a crash happens. Is there a config I'm missing?