django / channels_redis

Redis channel layer backend for Django Channels
BSD 3-Clause "New" or "Revised" License
587 stars 197 forks source link

Connection.disconnect got Future attached to a different loop when sending a message in a django ASGI view #327

Open bellini666 opened 2 years ago

bellini666 commented 2 years ago

Just upgraded my project to 4.0.0b1 (with also updating channels/daphne/etc) and started to get this error after doing a layer.group_send inside a django ASGI view (i.e. not a channels consumer):

Task exception was never retrieved
future: <Task finished name='Task-153' coro=<Connection.disconnect() done, defined at /home/bellini/dev/2u/backend/.venv/lib/python3.10/site-packages/redis/asyncio/connection.py:822> exception=RuntimeError("Task <Task pending name='Task-153' coro=<Connection.disconnect() running at /home/bellini/dev/2u/backend/.venv/lib/python3.10/site-packages/redis/asyncio/connection.py:837>> got Future <Future pending> attached to a different loop")>
Traceback (most recent call last):
  File "/home/bellini/dev/2u/backend/.venv/lib/python3.10/site-packages/redis/asyncio/connection.py", line 837, in disconnect
    await self._writer.wait_closed()  # type: ignore[union-attr]
  File "/usr/lib/python3.10/asyncio/streams.py", line 344, in wait_closed
    await self._protocol._get_close_waiter(self)
RuntimeError: Task <Task pending name='Task-153' coro=<Connection.disconnect() running at /home/bellini/dev/2u/backend/.venv/lib/python3.10/site-packages/redis/asyncio/connection.py:837>> got Future <Future pending> attached to a different loop

It seems that this is scheduled from the Connection.__del__ method.

obs. This does not break the view itself, but I see that message everytime. e.g. I have a post_save signal that will do a group_send to inform some listeners that the object has changes. When saving the object inside django admin or even in another view the group_send is executed correctly, the message is received by the listeners correctly and the view returns the response correctly, but I always see that traceback in my terminal

bbrowning918 commented 2 years ago

Does 4.0.0b2 also behave the same way? It has a few changes/fixes/improvements from b1 that look familiar to the above.

bellini666 commented 2 years ago

Hi @bbrowning918 ,

O sorry, that issue is in b2 actually (I got confused with channels itself, which is b1)

carltongibson commented 2 years ago

Thanks for the report @bellini666.

It's a cleanup warning... there's task not finished when the event loop shuts down. (You're running this under WSGI?)

A minimal reproduction would be handy.

carltongibson commented 2 years ago

The trick will be to give the connection a chance to close before finishing.

bellini666 commented 2 years ago

Hi @carltongibson ,

I'm running this under ASGI, it works both with runserver using daphne and also when running gunicorn with uvicorn worker.

I mentioned this in this PR which fixes a similar issue for PubSub layer. In there the issue is really giving it a chance to close, but in this case it is strange that the loops are indeed not the same.

I did print(id(asyncio.get_running_loop())) before calling group_send and the same print inside the close method that calls the disconnect function, and they are indeed different somehow.

carltongibson commented 2 years ago

@bellini666 — it would be handy if you could put together an absolutely minimal Django project that triggers this. (Single view etc.)

Different loop... — I need to see how/where you're creating the channel layer. 🤔

bellini666 commented 2 years ago

Hi @carltongibson ,

The test case you wrote for https://github.com/django/channels_redis/pull/329 should also trigger this when using the core layer instead of the pubsub

If not, I can try to create a minimal example. But I found both issues using the exact same code, the difference is that the core layer would give this traceback and the PubSub one would give that warning from that other issue I opened

carltongibson commented 2 years ago

@bellini666 yes, it would likely work 🤔 I didn't try it yet, for want of a fix.

My plan is to potter on it after the Channels release, but if you have ideas let me know.

Thanks!

aerickson1337 commented 1 year ago

While I was looking around with a similar issue to bellini I think I found a possible culprit for the error message?

I was tracing down the future error message and it seems like it all comes back to the block in the op: Connection.__del__ in redis.asyncio.connection

I monkey patched it to the following and that fixed the error message:

    def __del__(self):
        try:
            if self.is_connected:
                loop = asyncio.get_running_loop()
                coro = self.disconnect()
                if loop.is_running():
                    # original method call: loop.create_task(coro)
                    loop.run_until_complete(coro)
                else:
                    loop.run_until_complete(coro)
        except Exception:
            pass

The error that caused this seems kind of silly as I was following the future I found this in asyncio.events.AbstractEventLoop:

    def create_task(self, coro, *, name=None):
        raise NotImplementedError

*it's probably not related to this followed the methods wrong.

Not sure if bellini wants to try monkey patching redis.asyncio.connection.Connection.__del__ to not call the not implemented method and see if that fixes for him, but it fixed it for me. Not sure the best way forward here but I need to subclass Connection in my case anyway so subclassing a little more probably won't kill me. I have a post on how/why I was doing that in this issue comment if it helps anyone: comment

aerickson1337 commented 1 year ago

Doing some more testing that monkey patch seems to have stopped those future errors for me consistently (which is honestly weird). I can't reliably reproduce #332 but I'll seldomly get that error when testing, just without the future: ... error after doing the monkey patch to the Connection.__del__ dunder.