Grokzen / redis-py-cluster

Python cluster client for the official redis cluster. Redis 3.0+.
https://redis-py-cluster.readthedocs.io/
MIT License

in writing big data #409

Closed ucasiggcas closed 3 years ago

ucasiggcas commented 3 years ago

Hi, if I want to write 1,000,000 key-value pairs, how can I speed it up? Also, I saw the example in the script, but what does the last line mean?

from redis._compat import xrange
from rediscluster import RedisCluster

startup_nodes = [{"host": "127.0.0.1", "port": 7000}]
r = RedisCluster(startup_nodes=startup_nodes, max_connections=32, decode_responses=True)

for i in xrange(1000000):
    d = str(i)
    r.set(d, d)
    r.incrby(d, 1)

thx

Grokzen commented 3 years ago

@ucasiggcas If you mean the line r.incrby(...), it tells redis to increase the integer value stored at that key by 1 each time it is called. You can read the docs for that command here: https://redis.io/commands/incrby
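To make the effect of that line concrete, here is a minimal sketch that models INCRBY on a plain Python dict. It is not the redis client API; the `incrby` helper and the `store` dict are hypothetical stand-ins, and it only assumes the documented behaviour that a missing key is treated as 0 before the increment.

```python
def incrby(store, key, amount):
    """Model of INCRBY on a plain dict (illustration only, not the client API)."""
    current = int(store.get(key, "0"))  # a missing key counts as 0
    store[key] = str(current + amount)  # the value is stored back as a string
    return current + amount


store = {}
store["5"] = "5"              # corresponds to r.set("5", "5")
print(incrby(store, "5", 1))  # corresponds to r.incrby("5", 1); prints 6
```

So after each loop iteration in the example, key "i" holds the value i + 1.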

What you are really looking for is the MSET command, which you can read about here: https://redis.io/commands/mset

The example you are looking at is not really optimized for pushing a ton of SET operations to a cluster: the commands have to be fanned out across the nodes, so you will see a performance loss compared to a single node. However, you should look at pipelines and how they work. There are plenty of pipeline examples out in the wild if you search for "pipeline redis-py". You just call SET <key> <value> operations on the pipeline, and this client lib will group the commands by redis node and send them in batches, which speeds things up a lot if you really have a real-world use case for sending 1 million keys to a cluster in a single operation.
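As a rough sketch of the pipeline approach described here, the helper below queues SET commands on a pipeline and executes them in chunks. The names `write_in_batches`, `batch_size`, and `demo` are hypothetical, and the batch size of 10000 is an arbitrary assumption rather than a tuned value. It relies only on the redis-py style `pipeline()`, `set()`, and `execute()` calls.

```python
from itertools import islice


def write_in_batches(client, items, batch_size=10000):
    """Send SET commands through pipelines, batch_size commands at a time.

    `client` is anything with a redis-py style pipeline() method, for
    example a RedisCluster instance. Returns the number of keys written.
    """
    iterator = iter(items)
    written = 0
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        pipe = client.pipeline()
        for key, value in batch:
            pipe.set(key, value)  # queued locally, nothing is sent yet
        pipe.execute()  # sends the queued commands and waits for the replies
        written += len(batch)
    return written


def demo():
    # Not called here: it needs a running cluster. Host and port are the
    # ones from the example in the question.
    from rediscluster import RedisCluster

    startup_nodes = [{"host": "127.0.0.1", "port": 7000}]
    r = RedisCluster(startup_nodes=startup_nodes, max_connections=32, decode_responses=True)
    pairs = ((str(i), str(i)) for i in range(1000000))
    print(write_in_batches(r, pairs))
```

Executing in chunks rather than queuing all 1 million commands at once keeps the client-side buffer bounded. Which batch size works best depends on the cluster and the value sizes, so it is worth measuring.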

ucasiggcas commented 3 years ago

Thanks. That is, is rediscluster the same as redis when writing a lot of keys? And what does max_connections mean in this line? r = RedisCluster(startup_nodes=startup_nodes, max_connections=32, decode_responses=True) Thanks.

Grokzen commented 3 years ago

@ucasiggcas The constructor of the RedisCluster class in this lib explains it, as does the base implementation in redis-py and the Redis class there. In short, it controls the maximum number of connections the pool will keep open from your class instance to each individual redis server. With a normal non-clustered instance, that is just your single redis-server instance. In a clustered environment, it is the limit for each node the client attempts to connect to.

that is, the rediscluster is same as redis in writing a lot of keys ?

I do not get what you mean by this sentence. A clustered redis works in an extremely different way compared to a normal single redis-server node. It sounds like you need to read up on the basics of redis clustering before you dive any deeper. There is a ton of good documentation in the docs/ folder of this lib covering the concepts, issues, and considerations you need to know about before moving to a cluster. Also check out the cluster section of the docs on redis.io, which describes it in depth as well.