fnajera-rac-de closed this issue 7 months ago
In my case, this happens when the app is published to a Linux distribution, but not on Windows.
From what I have read, multi-threaded and multi-process access to the database is handled quite differently in LiteDB than in SQLite; that may be why LiteDB doesn't show the same issues as SQLite around the distributed lock.
@TXRock are you using AcquireDistributedLock in async methods?
I am not using AcquireDistributedLock, but my jobs do execute async methods.
I'm not sure how that is related to the heartbeat check, though.
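For reference, this is roughly what explicit usage would look like (just a sketch; the resource name, the 30-second timeout and DoWorkAsync are placeholders). Nothing like this exists in my job code, so any lock involved must be the one the storage library takes internally:

```csharp
using System;
using System.Threading.Tasks;
using Hangfire;

public class MyJob
{
    // Sketch of explicit AcquireDistributedLock usage inside an async job
    // (not present in my code; names and timeout are made up).
    public async Task RunAsync()
    {
        using (var connection = JobStorage.Current.GetConnection())
        using (connection.AcquireDistributedLock("my-resource", TimeSpan.FromSeconds(30)))
        {
            // The continuation after this await may resume on a different thread,
            // which is where ThreadLocal-based lock ownership can get confused.
            await DoWorkAsync();
        }
    }

    private Task DoWorkAsync() => Task.CompletedTask;
}
```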
It starts with:
(Hangfire.Storage.SQLite.SQLiteDistributedLock) Unable to update heartbeat on the resource 'HangFire:job:xxx:state-lock'. SQLite.SQLiteException: database is locked
and later on:
(Hangfire.Storage.SQLite.SQLiteDistributedLock) Unable to update heartbeat on the resource 'HangFire:job:xxx:state-lock'. The resource is not locked or is locked by another owner.
and it never recovers.
See #68 for an internal usage of ThreadLocal which seems incompatible with async.
The "database is locked" is probably a transaction failing and not being retried (haven't investigated that one).
But if you look at the code behind the message "The resource is not locked or is locked by another owner", I think you'll find the situation described in the other ticket. I assume SQLiteDistributedLock is also used internally by the library even if you don't use it explicitly.
I'll see if I can get some time to add a unit test for this problem (at least for the scenario I found).
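Something along these lines shows the core problem (a minimal sketch only, not the library's code; the field names are made up): a value stored in a ThreadLocal before an await is usually gone after it, because the continuation can resume on a different thread, whereas an AsyncLocal flows with the async context.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class ThreadLocalVsAsyncLocal
{
    // Hypothetical "lock owner" state, for illustration only.
    static readonly ThreadLocal<string> _threadOwner = new ThreadLocal<string>();
    static readonly AsyncLocal<string> _asyncOwner = new AsyncLocal<string>();

    static async Task Main()
    {
        _threadOwner.Value = "lock-owner-1";
        _asyncOwner.Value  = "lock-owner-1";

        // The continuation is typically scheduled on a thread-pool thread.
        await Task.Delay(100).ConfigureAwait(false);

        // ThreadLocal is tied to the original thread, so this is likely null here;
        // AsyncLocal flows with the execution context and keeps its value.
        Console.WriteLine($"ThreadLocal: {_threadOwner.Value ?? "<null>"}");
        Console.WriteLine($"AsyncLocal:  {_asyncOwner.Value  ?? "<null>"}");
    }
}
```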
My ASP.NET Core 6 app shows this error very often:
Unable to update heartbeat on the resource 'HangFire:xxx'. The resource is not locked or is locked by another owner.
I believe this has to do with #68, and it may go away once that issue is fixed.
But regardless of #68, if SQLiteDistributedLock cannot update the heartbeat because of that error, what's the point of retrying? I think the timer should be stopped in that case, or at a minimum the error log should be muted so that it doesn't show up indefinitely in the logs.
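Roughly what I mean, as a sketch only (the timer field, the TryRenewLock check and the message are all hypothetical, not the library's actual implementation):

```csharp
using System;
using System.Threading;

// Sketch: once the heartbeat update finds the lock is gone or owned by someone else,
// stop the timer instead of retrying (and logging the same error) forever.
class DistributedLockHeartbeatSketch : IDisposable
{
    private readonly Timer _heartbeatTimer;
    private readonly string _resource;

    public DistributedLockHeartbeatSketch(string resource, TimeSpan interval)
    {
        _resource = resource;
        _heartbeatTimer = new Timer(_ => UpdateHeartbeat(), null, interval, interval);
    }

    private void UpdateHeartbeat()
    {
        if (!TryRenewLock(_resource))
        {
            // The resource is not locked or is locked by another owner:
            // retrying will never succeed, so stop the timer and log once.
            _heartbeatTimer.Change(Timeout.Infinite, Timeout.Infinite);
            Console.Error.WriteLine($"Lost ownership of '{_resource}'; heartbeat timer stopped.");
        }
    }

    // Placeholder for the actual database update that renews the lock row.
    private bool TryRenewLock(string resource) => false;

    public void Dispose() => _heartbeatTimer.Dispose();
}
```

Stopping the timer once ownership is lost would leave a single error in the log instead of one per heartbeat interval.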