redis / redis-py

Redis Python client
MIT License

Thread safety in Connection Pool - connections growing #3118

Open joshverma opened 8 months ago

joshverma commented 8 months ago

Version: 3.5.3

Platform: python:3.9 Docker image

Description:

I'm trying to understand the behavior of ConnectionPool. If I have a thread pool of size 20, with each thread trying to get a redis connection from the same connection pool, shouldn't they all end up using the same connection object? Since a lock is used to pop from and push to the connection pool, there should only ever be 0 or 1 connections in use, right?

However, this is not the behavior I am seeing. Sometimes the pool is empty when a thread asks for a connection. At that point, a new connection is created and later added to the pool. This results in around 20 connections (give or take; it varies) being opened by the pool in total. The number of open connections also slowly grows over time, which surprises me since a lock is used to pop from and push to the pool.

Has anyone experienced this/can they shed some light on what may be happening?

Thanks in advance!

joshverma commented 8 months ago

I think I've figured it out. First of all, the lock is released right after a connection is popped from the pool, so other threads can then acquire the lock and pop another connection. The same pattern applies when releasing a connection back to the pool. I misread the code and was under the impression that the lock was held for the entire lifetime of sending the redis command as well. So this explains how many connections can be in use at once.
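To make the lock scope concrete, here is a minimal sketch using only the standard library. `ToyPool` is a hypothetical stand-in written for illustration, not redis-py's actual implementation. The only thing it assumes from the explanation above is that the lock guards the pop and the push, and nothing in between. A barrier stands in for the time a command spends on the network, forcing all threads to hold a connection at the same moment.

```python
import threading


class ToyPool:
    """Hypothetical toy pool; not redis-py's actual implementation."""

    def __init__(self):
        self._lock = threading.Lock()
        self._available = []  # idle "connections"
        self.created = 0      # total connections ever created

    def get_connection(self):
        # The lock covers only the pop (or the creation), not the use.
        with self._lock:
            if self._available:
                return self._available.pop()
            self.created += 1
            return object()

    def release(self, conn):
        # The lock covers only the push back onto the idle list.
        with self._lock:
            self._available.append(conn)


def run(num_threads=20):
    pool = ToyPool()
    # Every thread waits here while holding its connection, so no
    # connection is returned before all threads have taken one.
    barrier = threading.Barrier(num_threads)

    def worker():
        conn = pool.get_connection()
        barrier.wait()  # "command in flight"; the pool lock is not held
        pool.release(conn)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pool.created


if __name__ == "__main__":
    print(run())  # prints 20
```

Because no thread releases its connection until every thread has taken one, the pool is always empty when asked, and each of the 20 threads triggers a new connection.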

As for the number of connections slowly growing over time, this is due to the use of thread pools. If each thread in a thread pool asks for a connection, the order and "interleaving" of the threads' operations is effectively random.

An example is a thread pool of size 2:

If we extend this logic to a thread pool of size 20, there is more variability in the order and interleaving of the threads, simply because there are more of them. On the first execution of the code, those 20 threads might complete their operations using only 12 connections, for example. On the second execution, they might be interleaved in such a way that they need 14 connections. Whenever a run needs more connections than any earlier run did, the total number of current connections to redis increases. Technically there is an upper limit in this case (20, one connection per thread), but for complex applications the upper limit may be difficult to determine.
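The two-thread case can be sketched without real threads by writing out the two possible interleavings by hand. This is a toy model under stated assumptions: `ToyPool`, `run_without_overlap`, `run_with_overlap`, and `demo` are hypothetical names, and the model assumes idle connections are kept and never closed, which is what makes the total behave like a high-water mark.

```python
class ToyPool:
    """Hypothetical toy pool: reuse an idle connection, else create one."""

    def __init__(self):
        self.idle = []
        self.created = 0

    def get(self):
        if self.idle:
            return self.idle.pop()
        self.created += 1
        return f"conn-{self.created}"

    def release(self, conn):
        self.idle.append(conn)


def run_without_overlap(pool):
    # Thread A finishes before thread B starts, so B reuses A's connection.
    a = pool.get()
    pool.release(a)
    b = pool.get()
    pool.release(b)


def run_with_overlap(pool):
    # Thread B asks for a connection while A still holds one.
    a = pool.get()
    b = pool.get()
    pool.release(a)
    pool.release(b)


def demo():
    pool = ToyPool()
    totals = []
    run_without_overlap(pool)
    totals.append(pool.created)  # one connection is enough
    run_with_overlap(pool)
    totals.append(pool.created)  # the overlap forces a second connection
    run_without_overlap(pool)
    totals.append(pool.created)  # the total does not drop back
    return totals


if __name__ == "__main__":
    print(demo())  # prints [1, 2, 2]
```

The total only ever moves up to the largest overlap seen so far, which matches the slow growth described above: each "unluckier" interleaving raises the count, and nothing lowers it again.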

Hopefully this explanation helped someone else in the same boat as me. Or if I am wrong about anything, please feel free to correct me.

James-Leong commented 2 months ago

If you want to limit the number of connections, you can set the max_connections parameter when initializing the ConnectionPool object.
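In redis-py this is passed at construction time, along the lines of `redis.ConnectionPool(max_connections=20)`. As far as I know, the plain ConnectionPool raises a ConnectionError once the cap is reached rather than waiting, while BlockingConnectionPool waits for a connection to be released. The sketch below is a toy model of the raising behavior only; `CappedToyPool` and `TooManyConnections` are hypothetical names, not part of redis-py.

```python
class TooManyConnections(Exception):
    """Hypothetical error type for this sketch."""


class CappedToyPool:
    """Hypothetical toy pool with a hard cap; not redis-py's implementation."""

    def __init__(self, max_connections):
        self.max_connections = max_connections
        self.idle = []
        self.created = 0

    def get(self):
        if self.idle:
            return self.idle.pop()
        if self.created >= self.max_connections:
            # At the cap with nothing idle: refuse instead of growing.
            raise TooManyConnections("pool is at its cap")
        self.created += 1
        return f"conn-{self.created}"

    def release(self, conn):
        self.idle.append(conn)


def demo():
    pool = CappedToyPool(max_connections=2)
    a = pool.get()
    b = pool.get()
    try:
        pool.get()  # a third concurrent request exceeds the cap
        refused = False
    except TooManyConnections:
        refused = True
    pool.release(a)
    c = pool.get()  # succeeds again by reusing the released connection
    return refused, c, pool.created


if __name__ == "__main__":
    print(demo())  # prints (True, 'conn-1', 2)
```

With a cap in place, the connection count is bounded regardless of how the threads interleave, at the cost of callers having to handle (or wait out) the case where the pool is exhausted.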

joshverma commented 2 months ago

Yeah that is what I ended up doing, thanks @James-Leong!