ionelmc / python-redis-lock

Lock context manager implemented via redis SET NX EX and BLPOP.
https://pypi.python.org/pypi/python-redis-lock
BSD 2-Clause "Simplified" License
550 stars 78 forks source link

Is there a race condition when a lock is released? #58

Open todofixthis opened 7 years ago

todofixthis commented 7 years ago

Consider a situation with two processes:

Now, suppose that the following sequence of events unfolds:

  1. Process 1 executes the UNLOCK script to release the lock.
  2. Process 2's BLPOP returns.
  3. Before Process 2 can call SETNX, Process 3 calls SETNX, thereby jumping to the front of the queue and making Process 2 very cross indeed.

Am I thinking about this correctly, or is this not actually possible?

ionelmc commented 7 years ago

Hmmm ... yes, it would be possible, good thinking.

If process 3 would be trying to get the lock at the same time it's released, it would be possible.

How likely is this to happen is up for debate :)

grjones commented 6 years ago

This proved to be a real issue for my project, especially when using this for operations that involve high concurrency. What we observed was some of the operations needing the lock would block for the full timeout even though the lock should have already been released by another process.

I just don't think you can use BLPOP for this, although it was a good idea. If you are looking for a more battle-tested solution, I would just use the redis-py lock: https://github.com/andymccurdy/redis-py/blob/master/redis/lock.py

ionelmc commented 6 years ago

@grjones I think you have an entirely different issue ("block for the full timeout"?!?), perhaps open another issue and give details? Redis-py's lock implementation has even worse "fairness".

As for this issue, if anyone has any idea on how to make the release part more fair (iow, disallow clients jumping the queue if they try to acquire exactly when lock is released) please write your thoughts.

ionelmc commented 6 years ago

@todofixthis how about this change: replace the setnx with a script that does a LLEN on the signal list before the setnx - if it's nonempty then would be the same as setnx failing (and would fix your problem).

ionelmc commented 6 years ago

Actually scratch that, llen would most always be 0 (an element is added there on unlock).

I don't really think I can fix this ...

todofixthis commented 6 years ago

Hm. Yeah, I think you're right. The only ways I can think of to get around this require transactions, but transactions are allergic to blocking commands. Hm.

ionelmc commented 6 years ago

Maybe there's a way using redis streams but I have to wonder if this is worth fixing.

If anyone wants to take a stab there's the https://github.com/ionelmc/python-redis-lock/tree/race-on-unlock branch with a test for this.

nicois commented 4 years ago

my implementation uses a sorted set, where the 'score' is the timestamp the client joined the queue and the value is their own unique client key. When the lock is released, each awoken client checks if they have the lowest score, and executes SETNX if they do. If not, they insert their score/key into a second temporary sorted set, wait 2 seconds, then if they are the lowest try a SETNX again. If they are successful, discard any entries from the main sortedset with a score lower than theirs. This allows those at the front to have a short grace period before other clients can jump in, while still preserving FIFO for whoever is listening to the queue.