Grokzen / redis-py-cluster

Python cluster client for the official redis cluster. Redis 3.0+.
https://redis-py-cluster.readthedocs.io/
MIT License

Multithreading problem about thread safety #397

Open Jackvie opened 4 years ago

Jackvie commented 4 years ago

redis-py-cluster version 2.1.0, redis 3.5.1

  1. Creating 1000 client objects with the default Redis class took 0.0001 seconds, but with RedisCluster it took 28 seconds.
  2. So in my project I create only one client object and use it everywhere in the code that reads values. In my tests, this object is thread-safe.
  3. The problem is that when I use a pipeline created from this single shared client object, the returned values are wrong when multiple threads are involved.

code like this :

from django.conf import settings
from concurrent.futures import ThreadPoolExecutor
client = settings.READONLYCONST.r_client
print(client)
pipe = client.pipeline()

def func(task_id):
    try:
        if task_id % 2 == 0:
            info = client.hget('aaabbb', 1)
            print('============OK==========', info)
        else:
            pipe.setex('a', 60, 'bbb')
            pipe.setex('b', 60, 'bbb')
            pipe.execute()
    except Exception as e:
        print(task_id, e, '===')

def test_redis():
    print('BEGIN')
    with ThreadPoolExecutor(100) as executor:
        for i in range(100):
            executor.submit(func, i)
    print('END')

test_redis()

result in console like this:

RedisCluster<172.18.10.8:7101, 172.18.10.8:7102, 172.18.10.8:7103>
BEGIN
============OK========== None
============OK========== None
============OK========== OK
============OK========== None
============OK========== OK
============OK========== None
============OK========== None
END

Grokzen commented 4 years ago

@Jackvie You need to debug this deeper on your side and provide some kind of profiling run of the code to see where you are having this major slowdown. I doubt there is a lock issue per se, but I do expect that you might be running into a timeout issue or similar. The one thing I do not get, though, is why you create one pipeline and share it among the different threads. Try making a separate pipeline for each thread while keeping the client as a single instance shared across the threads; that might improve your performance a lot.
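
A minimal sketch of that suggestion, adapted from the script in the report. It assumes the shared client is passed in as an argument instead of being read from Django settings, and `run_tasks` is a hypothetical helper name that does not appear in the original script. The only real change is that each task creates its own pipeline from the shared client.

```python
from concurrent.futures import ThreadPoolExecutor


def func(client, task_id):
    """Run one task against the shared client."""
    if task_id % 2 == 0:
        # Plain commands go straight through the shared client.
        return client.hget('aaabbb', 1)
    # Create the pipeline inside the task, so no other thread can
    # queue commands on it or consume its results.
    pipe = client.pipeline()
    pipe.setex('a', 60, 'bbb')
    pipe.setex('b', 60, 'bbb')
    return pipe.execute()


def run_tasks(client, n_tasks=100, max_workers=100):
    """Hypothetical helper: submit all tasks and collect their results."""
    with ThreadPoolExecutor(max_workers) as executor:
        futures = [executor.submit(func, client, i) for i in range(n_tasks)]
        return [f.result() for f in futures]
```

With this layout, `run_tasks(client)` would be called with the single `RedisCluster` instance, and every odd-numbered task gets a short-lived pipeline of its own.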

Grokzen commented 4 years ago

@Jackvie I ran your script locally on my laptop and I don't see any performance problems or degradation in any way. The script finished in about 0.17 seconds on every run without any issues, so you need to track down the problem, because at this point it is probably something other than this client library code.

rudyryk commented 3 years ago

The Pipeline object is not thread-safe, as stated in the redis-py documentation: https://github.com/andymccurdy/redis-py#thread-safety

We probably have a similar limitation here.
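
To illustrate why a shared pipeline can mix up results, here is a toy model. `ToyPipeline` is a hypothetical stand-in, not the actual redis-py or redis-py-cluster implementation; it only assumes that a pipeline buffers queued commands and that `execute()` drains the whole buffer. The interleaving of two threads is written out by hand so the outcome is deterministic.

```python
class ToyPipeline:
    """Hypothetical stand-in: a buffer of queued commands."""

    def __init__(self):
        self.command_stack = []

    def setex(self, name, time, value):
        # Queue the command; nothing is sent yet.
        self.command_stack.append(('SETEX', name, time, value))
        return self

    def execute(self):
        # Drain everything queued so far, whoever queued it.
        queued, self.command_stack = self.command_stack, []
        return ['OK' for _ in queued]


shared = ToyPipeline()

# "Thread A" queues its first command, then gets interrupted.
shared.setex('a', 60, 'bbb')

# "Thread B" queues both of its commands and executes.
shared.setex('a', 60, 'bbb')
shared.setex('b', 60, 'bbb')
results_b = shared.execute()  # three results: A's command was drained too

# "Thread A" resumes, queues its second command and executes.
shared.setex('b', 60, 'bbb')
results_a = shared.execute()  # one result instead of the expected two

print(len(results_b), len(results_a))
```

Each caller expects two results but sees three and one, which is the kind of mismatch a per-thread pipeline avoids.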

Grokzen commented 3 years ago

@rudyryk What I read there is that it is not smart to create a Pipeline object as a singleton and share it across the code in multiple places. I would never do this with the plain redis-py object, so I would not do it with the cluster version either.