django / channels_redis

Redis channel layer backend for Django Channels
BSD 3-Clause "New" or "Revised" License
602 stars 197 forks source link

redis-py >=4.4.0 breaking changes #348

Closed bbrowning918 closed 10 months ago

bbrowning918 commented 1 year ago

This issue is to help with visibility and discussion:

redis-py >=4.4.0 introduces two breaking issues for the test suite and real usage on

  1. test_receive_cancel in both core and sentinel will fail with "TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'" on the response to ZREMRANGEBYSCORE. In usage it has been noted in redis-py on https://github.com/redis/redis-py/issues/2593 affecting normal looking send calls.
  2. test_message_expiry__group_send__one_channel_expires_message can hang indefinitely in the test suite

These affect both unrelated changes/PRs and 4.0.0 in actual usage. The change log for redis-py 4.4.0 does not seem to have a smoking gun and pinning it to 4.3.5 has the test suite passing better.

carltongibson commented 1 year ago

Thanks for writing this up @bbrowning918! 🏅

jajce commented 1 year ago

I believe I tracked down and fixed the first issue you described. I have a PR ready in redis-py repo https://github.com/redis/redis-py/pull/2607 The biggest change from 4.3.5 to 4.4.0 was that they got rid of second buffer SocketBuffer and started using asyncio.StreamReader directly. Which resulted in that odd behaviour when canceling a task executing redis commands

carltongibson commented 1 year ago

Hi @jajce — I had a quick look at your branch in #353. The GHA workflows didn't complete.

Locally it's hanging in test_core.py:

task: <Task pending name='Task-1037' coro=<test_message_expiry__group_send__one_channel_expires_message() done, defined at /Users/carlton/Projects/Django/channels_redis/tests/test_core.py:554> wait_for=<Future pending cb=[Task.task_wakeup()]>>

I haven't had a chance to dig more deeply yet, so I appreciate this is quite superficial — but maybe it gives you a breadcrumb… Thanks!

jajce commented 1 year ago

@carltongibson yeah, unfortunately my PR in redis-py is supposed to fix only test_receive_cancel :(. I am still looking into what is happening with test_message_expiry__group_send__one_channel_expires_message

jajce commented 1 year ago

Ok, investigated a little bit more and will share what I've found so far. Now I am pretty much sure that both of the issues are related to the removal of SocketBuffer https://github.com/redis/redis-py/pull/2418/ . The test_message_expiry__group_send__one_channel_expires_message issue is very well described in this ticket https://github.com/redis/redis-py/issues/2579 . In our test the issue is that even though we time out our receive here: https://github.com/django/channels_redis/blob/89b29ad300704aa09ec9e2e31efad5eec11ec03f/tests/test_core.py#L589-L591 our next receive, which is here https://github.com/django/channels_redis/blob/89b29ad300704aa09ec9e2e31efad5eec11ec03f/tests/test_core.py#L594 will have to wait for 5 seconds https://github.com/django/channels_redis/blob/89b29ad300704aa09ec9e2e31efad5eec11ec03f/channels_redis/core.py#L81 https://github.com/django/channels_redis/blob/89b29ad300704aa09ec9e2e31efad5eec11ec03f/channels_redis/core.py#L422-L424 to pass till we are able to read response from our next redis command. In our case the first command we are sending to redis is our cleanup EVAL script and when we are getting to read response StreamReader waits for 5 seconds to pass till it actually reads response for EVAL command.

So on one hand it is very different behaviour compared to redis v4.3.5 which is an issue, but on the other hand according to the Redis documentation regarding blocking operations it is an intended behaviour and redis client connection is supposed to be blocked till timeout passes or till another client writes something to the list we are trying to pop from.

I still need to understand how SocketBuffer prevented whole connection from being blocked in 4.3.5, which hopefully will lead to a fix.

carltongibson commented 1 year ago

OK, I've marked the failing tests xfail for now in #356.

This allows updating redis-py for 4.5.3, which is a security release.

The current Python 3.11 failures are pending the next redis-py release https://github.com/redis/redis-py/commit/480253037afe4c12e38a0f98cadd3019a3724254 (4.5.4 or greater.) Error in asyncio.timeout before Python 3.11.3.

We have questions here:

I'll leave this issue to track.

carltongibson commented 1 year ago

As per https://github.com/django/channels_redis/pull/358, we're going to skip PY311 tests until the next redis-py release.

DmytroLitvinov commented 1 year ago

@carltongibson , redis-py released 4.5.4. We can revert that change

carltongibson commented 1 year ago

Thanks @DmytroLitvinov. https://github.com/django/channels_redis/commit/513f85901bee6ad2543abe0c9e113e73865218a0

bluesurfer commented 1 year ago

@carltongibson after upgrading to 4.1.0 I got troubles when testing channel layers.

I am using django-channels (specifically the async consumers) and I notice that when I run multiple tests that are using the channel_layer I keep getting TimeoutError randomly. Fun fact: if I run each test one at a time no error occurs.

I suspected there was something that is not flushing properly between tests. Indeed, calling app.channel_layer.flush() seem to solve the problem. Here's a toy test:

@pytest.mark.asyncio
@pytest.mark.django_db(transaction=True)
async def test_consumer1():
    app = TestConsumer()
    communicator = WebsocketCommunicator(app, "")
    connected, _ = await communicator.connect()
    assert connected
    await app.channel_layer.flush()

I am trying to reproduce the issue with something simpler but it looks harder than expected. I struggle to understand what is going on under the hood when mixing up sync_to_async, pytest.mark.django_db(transaction=True) and channel layers.

carltongibson commented 10 months ago

377 added explicit testing against multiple redis-py versions:

 redis46: redis>=4.6,<4.7
 redis50: redis>=5.0,<5.1
 redismain: https://github.com/redis/redis-py/archive/master.tar.gz

Any of those should work. Please open a new issue if you hit problems with one of those versions.