Per the Redis "Redlock" guidance, locks should implement a maximum number of retries and use an exponential backoff algorithm as a best practice. Setting aside the fact that redis-lock retries every 50ms regardless of how long it has been trying (even hours), we've documented a failure case with abandoned user HTTP requests that causes redis-locks to stack up sequentially for each failed request, for hours.
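For context, a minimal sketch of that guidance might look like the following; `tryAcquire` is a hypothetical single-attempt helper (not part of redis-lock's current API), shown only to illustrate capped retries with exponential backoff:

```js
// Sketch only: bounded retries with exponential backoff, per the Redlock guidance.
// tryAcquire(key) is a hypothetical single-attempt call that resolves to a release
// function on success or null on contention; redis-lock exposes nothing like it today.
async function acquireWithBackoff(key, { maxRetries = 10, baseDelayMs = 50 } = {}) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const release = await tryAcquire(key);
    if (release) return release;

    // Back off exponentially with a little jitter, rather than polling every 50ms.
    const delayMs = baseDelayMs * 2 ** attempt + Math.random() * baseDelayMs;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`Gave up acquiring lock "${key}" after ${maxRetries} retries`);
}
```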
As a caller-side workaround, consider a pattern like the following:
function retrieveLockOrGiveUp(key, maxWaitMs) {
  return new Promise(async (resolve, reject) => {
    // Give up after maxWaitMs; redis-lock itself keeps retrying in the
    // background and will still grab (and hold) the lock if it ever wins.
    const maxWaitTimer = setTimeout(
      () => reject(new Error('Could not fetch lock in reasonable amount of time')),
      maxWaitMs
    );
    const release = await redisLock(key, 5000); // lock held for up to 5000ms
    clearTimeout(maxWaitTimer);
    resolve(release);
  });
}
Alternatively, consider that an HTTP call will typically be abandoned by the user if it takes longer than 15 seconds, which suggests a sensible upper bound for maxWaitMs.
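As an illustration only, a hypothetical Express handler could use the pattern above with a 15-second cap; `app`, `runJobABC`, and the route are made up for the example:

```js
// Hypothetical Express handler using retrieveLockOrGiveUp with a 15s cap,
// mirroring how long a user will realistically wait on the request.
app.post('/jobs/abc', async (req, res) => {
  let release;
  try {
    release = await retrieveLockOrGiveUp('jobABC', 15000);
  } catch (err) {
    return res.status(503).send('Busy, try again later');
  }
  try {
    await runJobABC(); // the work the lock is guarding (placeholder)
  } finally {
    release();
  }
  res.sendStatus(200);
});
```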
What we directly observed were thousands of queued lock requests, each held for the full 5000ms once acquired, because none of them could get the lock promptly; redis-lock never gives up, nor does it have an AbortController / AbortSignal-style concept to cancel the retry loop.
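To make the missing concept concrete, here is a rough sketch of what an AbortSignal-aware acquisition could look like; again, `tryAcquire` is hypothetical and none of this exists in redis-lock today:

```js
// Sketch of the missing capability: an acquisition loop that stops retrying
// when an AbortSignal fires. tryAcquire is the same hypothetical
// single-attempt helper as above; this is not redis-lock API.
async function acquireAbortable(key, { signal, retryDelayMs = 50 } = {}) {
  while (true) {
    if (signal && signal.aborted) {
      throw new Error(`Lock acquisition for "${key}" was aborted`);
    }
    const release = await tryAcquire(key);
    if (release) return release;
    await new Promise((resolve) => setTimeout(resolve, retryDelayMs));
  }
}

// A caller could then tie the retry loop to the lifetime of the HTTP request,
// e.g. const controller = new AbortController();
//      req.on('close', () => controller.abort());
//      const release = await acquireAbortable('jobABC', { signal: controller.signal });
```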
Even without the pattern above, a typical HTTP request guarded by redis-lock is bound to hit the same issue: hundreds or thousands of locks may be acquired sequentially even though the requesting users may have lost their socket connections hours earlier.
When an edge case caused an excess of backlogged lock attempts that all failed, we ended up with several machines fully saturated by infinite retry loops, since there is no upper bound on the retry logic here.
I suggest adding a few new options, or at the very least documenting that redis-lock users should be wary of high-contention environments and should consider Redlock if they are guarding HTTP requests on long-lived servers.
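For anyone guarding HTTP requests on long-lived servers, a rough sketch of the Redlock route using the node `redlock` package (v4-style API assumed; `redisClient` and `runJobABC` are placeholders) might look like:

```js
// Rough sketch using the node "redlock" package (v4-style API assumed),
// which provides a hard retry cap and jittered delays out of the box.
const Redlock = require('redlock');

const redlock = new Redlock([redisClient], {
  retryCount: 10,   // give up after 10 attempts instead of retrying forever
  retryDelay: 200,  // ms between attempts
  retryJitter: 100, // ms of random jitter added to each delay
});

async function guardedHandler(req, res) {
  let lock;
  try {
    lock = await redlock.lock('locks:jobABC', 5000); // 5s TTL, matching above
  } catch (err) {
    return res.status(503).send('Could not acquire lock, try again later');
  }
  try {
    await runJobABC(); // hypothetical guarded work
  } finally {
    await lock.unlock();
  }
  res.sendStatus(200);
}
```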