madelson / DistributedLock

A .NET library for distributed synchronization
MIT License

Redis timeout on RedLockRelease #191

Open alexandrepepin-boxoffice opened 4 months ago

alexandrepepin-boxoffice commented 4 months ago

We are sometimes getting timeouts when releasing the Redis lock. What's the best way to handle this so that the lock is freed and does not stay locked forever?

 var redisDistributedLock = new RedisDistributedLock(lockKey, database);
 await using IAsyncDisposable handle = await redisDistributedLock.AcquireAsync();

Exception:

One or more errors occurred. (Timeout awaiting response (outbound=0KiB, inbound=0KiB, 5766ms elapsed, timeout is 5000ms), command=EVAL, next: EVAL, inst: 0, qu: 0, qs: 0, aw: False, rs: ReadAsync, ws: Idle, in: 0, serverEndpoint: genosha-eus.redis.cache.windows.net:6380, mc: 1/1/0, mgr: 10 of 10 available, clientName: pd0sdwk0004XD, IOCP: (Busy=1,Free=999,Min=200,Max=1000), WORKER: (Busy=7,Free=1016,Min=50,Max=1023), v: 2.2.88.56325 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)) Timeout awaiting response (outbound=0KiB, inbound=0KiB, 5766ms elapsed, timeout is 5000ms), command=EVAL, next: EVAL, inst: 0, qu: 0, qs: 0, aw: False, rs: ReadAsync, ws: Idle, in: 0, serverEndpoint: genosha-eus.redis.cache.windows.net:6380, mc: 1/1/0, mgr: 10 of 10 available, clientName: pd0sdwk0004XD, IOCP: (Busy=1,Free=999,Min=200,Max=1000), WORKER: (Busy=7,Free=1016,Min=50,Max=1023), v: 2.2.88.56325 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts) 

System.AggregateException:
   at Medallion.Threading.Redis.RedLock.RedLockRelease+<ReleaseAsync>d__3.MoveNext (DistributedLock.Redis, Version=1.0.0.0, Culture=neutral, PublicKeyToken=12bc08512096ade0: /_/DistributedLock.Redis/RedLock/RedLockRelease.cs:78)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.ConfiguredValueTaskAwaitable+ConfiguredValueTaskAwaiter.GetResult (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Medallion.Threading.Redis.RedLock.RedLockHandle+<DisposeAsync>d__9.MoveNext (DistributedLock.Redis, Version=1.0.0.0, Culture=neutral, PublicKeyToken=12bc08512096ade0: /_/DistributedLock.Redis/RedLock/RedLockHandle.cs:52)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
... Internal code

madelson commented 3 months ago

@alexandrepepin-boxoffice the Redis lock has a default expiry of 30s (with regular renewal for as long as the lock is held). See these docs.

So I would expect that if you lose connection to your Redis instance the lock would be held for no more than 30s.
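Given that expiry, one way to recover from a failed release is to catch the exception thrown on dispose and log it instead of letting it fail the whole operation. The following is only a sketch: the class name, the `criticalSection` parameter, and the choice of exception types to catch are assumptions made for illustration (the trace above shows an `AggregateException` coming out of `DisposeAsync`), and it relies on the key expiring on its own as described above.

```csharp
using System;
using System.Threading.Tasks;
using Medallion.Threading.Redis;
using StackExchange.Redis;

// Hypothetical helper, not part of DistributedLock.
public static class TolerantReleaseExample
{
    public static async Task RunAsync(IDatabase database, string lockKey, Func<Task> criticalSection)
    {
        var redisDistributedLock = new RedisDistributedLock(lockKey, database);

        // Acquire without `await using` so the release can be wrapped separately.
        var handle = await redisDistributedLock.AcquireAsync();
        try
        {
            await criticalSection();
        }
        finally
        {
            try
            {
                await handle.DisposeAsync();
            }
            catch (Exception ex) when (ex is AggregateException || ex is TimeoutException || ex is RedisException)
            {
                // Assumption: a failed release is tolerable here because the
                // lock key should expire by itself (30s by default).
                Console.Error.WriteLine($"Lock release failed, relying on expiry: {ex.Message}");
            }
        }
    }
}
```

Until the key expires, other callers trying to acquire the same lock will keep waiting, so this trades a thrown exception for a bounded delay.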

You can shorten that time by adjusting the options, but if you make it too short you risk losing the lock spuriously due to lag on your app server (for example, the app server is too overloaded to renew the lease before it expires).
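As a rough illustration, the expiry can be set through the options callback accepted by the `RedisDistributedLock` constructor. This is a minimal sketch: the class name is hypothetical and the 10s value is an arbitrary example, not a recommendation.

```csharp
using System;
using System.Threading.Tasks;
using Medallion.Threading.Redis;
using StackExchange.Redis;

// Hypothetical example class, not part of DistributedLock.
public static class ShortExpiryExample
{
    public static async Task RunAsync(IDatabase database, string lockKey)
    {
        // 10s is an illustrative value only. A shorter expiry frees an
        // abandoned lock sooner but leaves less slack for renewal when the
        // app server is overloaded.
        var redisDistributedLock = new RedisDistributedLock(
            lockKey,
            database,
            options => options.Expiry(TimeSpan.FromSeconds(10)));

        await using (await redisDistributedLock.AcquireAsync())
        {
            // Critical section: the lock is renewed in the background while held.
        }
    }
}
```

The same callback is where a longer expiry would go if renewals are at risk of being delayed.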

I'm also curious whether you think the timeouts themselves are strange: is this something that happens rarely, with other unrelated Redis requests also timing out when it does, or does it happen every time and only when the library is trying to release?

alexandrepepin-boxoffice commented 3 months ago

Timeouts aren't strange: they happen rarely, and when they do, other unrelated Redis requests time out as well. I wanted to understand how to recover from those errors. If the default expiry is 30s and configurable, that fits our needs perfectly.

madelson commented 3 months ago

I'm going to consider this resolved for the time being. It would be great if we had a way to ensure that the lock renewal system was not affected by thread starvation, but that seems difficult, especially since StackExchange.Redis itself is async under the hood and might use the default thread pool even if we make efforts not to. So I think for now the guidance should be to set a longer expiry if you plan to exhaust the thread pool, or better yet, don't exhaust the thread pool :-).