Closed dgioulakis closed 5 years ago
I believe the problematic or curious line is: https://github.com/perrich/Hangfire.MemoryStorage/blob/eed3fb6a0ebe9d3596a51f35a723a1cb7c0cd4e0/src/Hangfire.MemoryStorage/MemoryStorageConnection.cs#L111
|| q.FetchedAt.Value < timeout
Can anyone explain why this provider is allowing already "fetched" jobs to be re-dequeued? As far as I can tell, FetchNextJobTimeout is essentially a window of time in which a job must complete before it gets selected again for execution.
Why should the storage connection fetch a job that's already been fetched? Or why are jobs not really dequeued here?
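To make the question concrete, here is a hedged sketch (in Python, since it's easy to run; the names `is_eligible` and the 30-minute default are invented for illustration, not taken from the actual C# source) of what the condition at MemoryStorageConnection.cs#L111 appears to express: a queued item is eligible for dequeue if it has never been fetched, OR its FetchedAt timestamp is older than now minus FetchNextJobTimeout.

```python
from datetime import datetime, timedelta

# Assumed value for illustration only; the real default may differ.
FETCH_NEXT_JOB_TIMEOUT = timedelta(minutes=30)

def is_eligible(fetched_at, now):
    # Mirrors the quoted predicate: never fetched, OR fetched before the cutoff.
    cutoff = now - FETCH_NEXT_JOB_TIMEOUT
    return fetched_at is None or fetched_at < cutoff

now = datetime(2019, 1, 1, 12, 0)
print(is_eligible(None, now))                         # never fetched -> True
print(is_eligible(now - timedelta(minutes=5), now))   # recently fetched -> False
print(is_eligible(now - timedelta(minutes=45), now))  # fetched 45 min ago -> True
```

So the second branch is what re-surfaces a job that was already handed to a worker.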
Still digging through; it looks like Hangfire only dequeues a job after it has finished processing. So a job that keeps processing longer than FetchNextJobTimeout may be re-executed by an additional concurrent worker process. Something doesn't seem right here.
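The scenario above can be simulated with a small sketch (again in Python for convenience; `QueuedJob` and `fetch_next` are invented names, not the real storage code): fetching marks the job but does not remove it, so once FetchedAt falls outside the FetchNextJobTimeout window, a second worker's fetch returns the same job while the first is still running it.

```python
from datetime import datetime, timedelta

TIMEOUT = timedelta(minutes=30)  # assumed FetchNextJobTimeout, for illustration

class QueuedJob:
    def __init__(self, job_id):
        self.job_id = job_id
        self.fetched_at = None

def fetch_next(queue, now):
    for job in queue:
        if job.fetched_at is None or job.fetched_at < now - TIMEOUT:
            job.fetched_at = now  # marked fetched, but NOT removed from the queue
            return job
    return None

queue = [QueuedJob("job-1")]
t0 = datetime(2019, 1, 1, 12, 0)

first = fetch_next(queue, t0)                           # worker A starts job-1
second = fetch_next(queue, t0 + timedelta(minutes=31))  # worker B, 31 min later
print(first is second)  # True: the same job object was handed out twice
```

Nothing in this sketch ever deletes the job on completion, which matches the observation that the dequeue only happens after processing finishes.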
Same problem referenced here: https://github.com/HangfireIO/Hangfire/issues/1197
Hi, I think retrieving uncompleted jobs is by design. You can define a longer timeout, for instance 24 hours, by setting FetchNextJobTimeout in the MemoryStorageOptions.
Hi, do we have any solution for this? We are using the Oracle provider and we see the same issue.
Why would we want to retrieve a job that's currently being processed by a different Worker?
Looks like this is partially the answer: https://github.com/HangfireIO/Hangfire/issues/936#issuecomment-315087104
Like InvisibilityTimeout and SlidingInvisibilityTimeout in the SQL Server storage implementation, FetchNextJobTimeout is really a window of time in which a job MUST complete, or else it may be started again by another Worker, either concurrently or upon the next dequeue from the storage connection. I just don't understand why this can't be automated by Hangfire, given that we can find out a job's state. Perhaps that's not guaranteed?
We no longer have the FetchNextJobTimeout property in version 1.7 (memory storage 1.4); I don't know what to do.
Please see: https://github.com/HangfireIO/Hangfire/issues/1025#issuecomment-473707971
It doesn't seem that memory storage actually dequeues the job or marks it as dequeued (I don't know what the correct behavior should be).
Results
One job enqueue results in multiple worker executions. It's even worse if FetchNextJobTimeout is reduced further. I'm not sure whether this is a problem with Hangfire, MemoryStorage, or both.