HangfireIO / Hangfire

An easy way to perform background job processing in .NET and .NET Core applications. No Windows Service or separate process required
https://www.hangfire.io

The timeout elapsed prior to obtaining a distributed lock #584

Open danilovulovic opened 8 years ago

danilovulovic commented 8 years ago

We have multiple jobs marked with DisableConcurrentExecution(600), but some operations can last more than 10 minutes. In that case, we get the following error:

2016-05-31 15:40:02,272 [Worker #8a94a848] ERROR Hangfire.AutomaticRetryAttribute - Failed to process the job '210619': an exception occurred.
Hangfire.Storage.DistributedLockTimeoutException: Timeout expired. The timeout elapsed prior to obtaining a distributed lock on the 'HangFire:DataUploader.Start' resource.
   at Hangfire.SqlServer.SqlServerDistributedLock.Acquire(IDbConnection connection, String resource, TimeSpan timeout)
   at Hangfire.SqlServer.SqlServerDistributedLock..ctor(SqlServerStorage storage, String resource, TimeSpan timeout)
   at Hangfire.SqlServer.SqlServerConnection.AcquireDistributedLock(String resource, TimeSpan timeout)
   at Hangfire.DisableConcurrentExecutionAttribute.OnPerforming(PerformingContext filterContext)
   at Hangfire.Server.BackgroundJobPerformer.InvokePerformFilter(IServerFilter filter, PerformingContext preContext, Func`1 continuation)

If the operation lasts less than 10 minutes, everything works fine.
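The behavior described above can be modeled in a few lines. This is only a sketch in Python using an in-process `threading.Lock`, not Hangfire or SQL Server code; the names `run_job` and `simulate` are hypothetical, and the assumption is that the 600 passed to the attribute is the number of seconds a job waits for the lock, not a limit on how long the job may run.

```python
import threading
import time


def run_job(lock, wait_timeout, work_seconds, results, name, holding=None):
    # Wait at most wait_timeout seconds for the lock (the role the 600 plays above).
    if not lock.acquire(timeout=wait_timeout):
        results[name] = "lock timeout"
        return
    try:
        if holding is not None:
            holding.set()  # tell the caller the lock is now held
        time.sleep(work_seconds)  # simulated job body
        results[name] = "completed"
    finally:
        lock.release()


def simulate(wait_timeout, work_seconds):
    lock = threading.Lock()
    results = {}
    holding = threading.Event()
    first = threading.Thread(
        target=run_job,
        args=(lock, wait_timeout, work_seconds, results, "first", holding),
    )
    second = threading.Thread(
        target=run_job,
        args=(lock, wait_timeout, work_seconds, results, "second"),
    )
    first.start()
    holding.wait()  # start the second job only once the first holds the lock
    second.start()
    first.join()
    second.join()
    return results
```

Under these assumptions, `simulate(wait_timeout=0.2, work_seconds=0.6)` lets the first job finish while the second gives up waiting, which mirrors the exception in the log. With a wait timeout longer than the job, both complete.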

RichardSilveira commented 7 years ago

Any news on this?

lomaky commented 7 years ago

I'm having the same issue, but it seems to have been fixed in 1.6.15. I will update and see if it continues to happen.

https://github.com/HangfireIO/Hangfire/releases/tag/v1.6.15

1.6.15, released by @odinserj on 9 Aug

Release Notes

This release contains important fixes for the Hangfire.SqlServer package, which actively uses the sp_getapplock stored procedure to synchronize work between different servers. I've realized that locks shouldn't be awaited on SQL Server's side: each blocked request blocks a single worker thread, which may lead to starvation of SQL Server's connection pool.

When you are using a lot of workers and there is contention on a few lock resources (such as when using the DisableConcurrentExecutionAttribute, batches, or many continuations on a single job), all worker threads can be blocked in SQL Server, making it unresponsive and leading to a huge number of timeout exceptions.
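One general way to avoid waiting inside the lock service is to make each attempt return immediately and do the waiting on the caller's side. The Python sketch below shows that pattern with an in-process `threading.Lock` standing in for the server-side lock. The function name `acquire_by_polling` and the polling interval are assumptions for illustration, and the release notes quoted here do not say whether 1.6.15 uses this exact approach.

```python
import threading
import time


def acquire_by_polling(lock, timeout, poll_interval=0.01):
    """Try a non-blocking acquire repeatedly until the deadline passes."""
    deadline = time.monotonic() + timeout
    while True:
        if lock.acquire(blocking=False):  # returns immediately either way
            return True
        if time.monotonic() >= deadline:
            return False
        # Wait on the caller's side, so no request sits blocked in the lock service.
        time.sleep(poll_interval)
```

The trade-off in this sketch is a small delay (up to one polling interval) before a freed lock is noticed, in exchange for never holding a blocked request open while waiting.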