Closed todorovicg closed 2 years ago
The following unit test reproduces the bug:
// see https://github.com/raisedapp/Hangfire.Storage.SQLite/issues/38
[Fact, CleanDatabase]
public void Ctor_SetLockExpireAtWorks_WhenResourceIsLockedAndExpired()
{
UseConnection(database =>
{
// add a lock (taken by another process who is now killed) which will expire in 3 seconds from now
var distributedLock = new DistributedLock();
distributedLock.Id = Guid.NewGuid().ToString();
distributedLock.Resource = "resource1";
distributedLock.ExpireAt = DateTime.UtcNow.Add(TimeSpan.FromSeconds(3));
database.Database.Insert(distributedLock);
// try to get the lock in the next 10 seconds
// ideally, after ~3 seconds, the constructor should succeed
using (new SQLiteDistributedLock("resource1", TimeSpan.FromSeconds(10), database, new SQLiteStorageOptions() { DistributedLockLifetime = TimeSpan.FromSeconds(3) }))
{
DistributedLock lockEntry = database.DistributedLockRepository.FirstOrDefault(_ => _.Resource == "resource1");
Assert.NotNull(lockEntry);
}
});
}
@felixclase Could you please have a look at this bug? You can copy/paste the code above into SQLiteDistributedLockFacts.cs to reproduce it. The real-life scenario would be that the process using this library is killed while the lock was hold.
Edit: I think the problem is that the Cleanup()
is called only once before the waiting loop - I think it should be called while in the loop as well, since the expiration might happen while we wait. But I'm not sure of what is the best implementation.
After some testing, I think the Acquire()
logic is flawed. Why would you extend an existing lock (which it's not yours) each time you try to Acquire()
it (because of the call to .Update()
)? This makes it impossible to acquire a lock which is already taken but expiring - we are extending its expiration date artificially each time. And there is already a mechanism in place to extend the lock by its owner (the heartbeat).
Also: the lock in the database has no concept of 'owner'. If it did, the problem above (extending a lock you don't own) would have never happened in the first place.
I'll try to work on this and make a PR.
See #41
Nice find @fnajera-rac-de! I've checked your PR as well, looks good.
@felixclase On an unrelated note I've provided a PR #42 which should resolve an issues that I've ran into once. To not bother with opening a new issue, I'll just leave this comment here.
Thanks guys, I will be testing your changes this weekend
Fix published in version 0.3.1
Still randomly getting distributed lock exception. Hangfire.Storage.SQLite version: 0.3.0 Hangfire.Core version: 1.7.22
Timeout expired. The timeout elapsed prior to obtaining a distributed lock on the 'Could not place a lock on the resource 'HangFire:extension:job-mutex:lock:StatusUpdateJob': The lock request timed out.' resource. at Hangfire.Storage.SQLite.SQLiteDistributedLock.Acquire(TimeSpan timeout) at Hangfire.Storage.SQLite.SQLiteDistributedLock..ctor(String resource, TimeSpan timeout, HangfireDbContext database, SQLiteStorageOptions storageOptions) at Hangfire.Storage.SQLite.HangfireSQLiteConnection.AcquireDistributedLock(String resource, TimeSpan timeout)