redis / redis-py

Redis Python client
MIT License

Async pubsub burns CPU by default. #3208

Open judilsteve opened 4 months ago

judilsteve commented 4 months ago

Version: redis-py 5.0.3, redis 7.0.12

Platform: Python 3.12 on Linux

Description:

In redis-py's asyncio pubsub get_message() function, the timeout kwarg defaults to 0, which means that by default the function returns immediately if no message is available. So anyone who blindly follows the example code, which essentially amounts to:

while True:
    message = await pubsub.get_message()
    if message:
        ...  # Deal with message

will find that the thread running it is pegged at 100% usage, constantly burning through that tight loop. To get the behaviour that a sane person would expect (yielding to the event loop until a message is available), they need to explicitly set timeout=None.

This is a terrible default value which resulted in me spending an entire afternoon trying to figure out why my server was pinned at 100% CPU usage, even though there were zero messages going through the pubsub channel. I have since changed my code to use timeout=None, but I really think this should be the default, especially when using the async version of the client (I have not checked to see if the same footgun applies to the synchronous API).
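For anyone landing here with the same problem, a minimal sketch of the blocking variant follows. It assumes `pubsub` is a redis.asyncio PubSub object that is already subscribed to a channel; `consume` and `handle` are hypothetical names used only for illustration, not part of redis-py.

```python
from typing import Any, Awaitable, Callable


async def consume(pubsub: Any, handle: Callable[[dict], Awaitable[None]]) -> None:
    """Forward each pubsub message to `handle` (a hypothetical callback)."""
    while True:
        # timeout=None suspends this coroutine until something arrives,
        # instead of returning None immediately as the timeout=0 default does.
        message = await pubsub.get_message(
            ignore_subscribe_messages=True, timeout=None
        )
        # None can still come back (e.g. for an ignored subscribe
        # confirmation), so keep the check.
        if message is not None:
            await handle(message)
```

With this loop the coroutine only wakes up when there is something to read, so an idle channel should cost essentially no CPU.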

judilsteve commented 4 months ago

I guess at this point changing the default value would be a breaking change for many users. At the very least, can we add some big bold warnings to the documentation?

hoppiece commented 2 months ago

I faced the same problem and wasted a lot of time because of this bug, but thanks to your post I was able to patch it faster. Thanks a lot!

vikyw89 commented 6 days ago

> I guess at this point changing the default value would be a breaking change for many users. At the very least, can we add some big bold warnings to the documentation?

Wait, what breaking change can we expect from changing this default value to None?

judilsteve commented 6 days ago

> Wait, what breaking change can we expect from changing this default value to None?

Using timeout=None will suspend the coroutine until a pubsub message arrives. If you are using the pubsub loop to do other things periodically, then those things will not run in situations where there is a large time gap between pubsub messages.
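To make that concrete, here is a rough sketch of a loop that does periodic work alongside pubsub. It uses a finite timeout rather than None, so the loop wakes up at least once per interval without spinning. The names `consume_with_periodic`, `handle`, `periodic` and `interval` are hypothetical, and `pubsub` is assumed to be an already-subscribed redis.asyncio PubSub object.

```python
import time
from typing import Any, Awaitable, Callable


async def consume_with_periodic(
    pubsub: Any,
    handle: Callable[[dict], Awaitable[None]],
    periodic: Callable[[], Awaitable[None]],
    interval: float = 1.0,
) -> None:
    """Handle messages and also run `periodic` roughly every `interval` seconds."""
    next_run = time.monotonic() + interval
    while True:
        # Wait for a message, but no longer than the time left until the
        # next periodic run. This yields to the event loop while waiting.
        remaining = max(0.0, next_run - time.monotonic())
        message = await pubsub.get_message(
            ignore_subscribe_messages=True, timeout=remaining
        )
        if message is not None:
            await handle(message)
        # Run the periodic work once its deadline has passed.
        if time.monotonic() >= next_run:
            await periodic()
            next_run = time.monotonic() + interval
```

Code written this way does not depend on the default at all. Code that relies on get_message() returning immediately so that the rest of the loop body can run is what would stall if the default became None. Another option, under the same assumptions, is to keep timeout=None in the pubsub loop and move the periodic work into its own asyncio task.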

vikyw89 commented 6 days ago

I see, that makes sense.