redis / redis-py

Redis Python client

Connection lost issue since 4.4.0 upgrade #2491

Open sileht opened 1 year ago

sileht commented 1 year ago

Version: What redis-py and what redis version is the issue happening on?

redis-py: 4.4.0 redis: 6.2.3

Platform: What platform / version? (For example Python 3.5.1 on Windows 7 / Ubuntu 15.10 / Azure)

Docker image of python: 3.11.0 on Debian bullseye

Description: Description of your issue, stack traces from errors and code that reproduces the issue

We upgraded an application to redis-py 4.4.0 and started getting the following traceback frequently. Downgrading to 4.3.5 resolves the issue for us.

If you need more information, feel free to ask.

ConnectionResetError: Connection lost
  File "redis/asyncio/connection.py", line 752, in send_packed_command
    await self._writer.drain()
  File "asyncio/streams.py", line 378, in drain
    await self._protocol._drain_helper()
  File "asyncio/streams.py", line 167, in _drain_helper
    raise ConnectionResetError('Connection lost')
ConnectionError: Error UNKNOWN while writing to socket. Connection lost.
  File "starlette/applications.py", line 124, in __call__
    await self.middleware_stack(scope, receive, send)
  File "starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "xxxxxxx/middlewares/starlette_workaround.py", line 29, in __call__
    await super().__call__(scope, receive, send)
  File "starlette/middleware/base.py", line 106, in __call__
    response = await self.dispatch_func(request, call_next)
  File "xxxxxxx/middlewares/starlette_workaround.py", line 20, in dispatch
    return await call_next(request)
  File "starlette/middleware/base.py", line 80, in call_next
    raise app_exc
  File "starlette/middleware/base.py", line 69, in coro
    await self.app(scope, receive_or_disconnect, send_no_error)
  File "uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "starlette/middleware/httpsredirect.py", line 19, in __call__
    await self.app(scope, receive, send)
  File "starlette/middleware/base.py", line 106, in __call__
    response = await self.dispatch_func(request, call_next)
  File "xxxxxxx/middlewares/security.py", line 15, in dispatch
    response = await call_next(request)
  File "starlette/middleware/base.py", line 80, in call_next
    raise app_exc
  File "starlette/middleware/base.py", line 69, in coro
    await self.app(scope, receive_or_disconnect, send_no_error)
  File "ratelimit/core.py", line 90, in __call__
    return await self.app(scope, receive, send)
  File "starsessions/middleware.py", line 138, in __call__
    await self.app(scope, receive, send_wrapper)
  File "starsessions/middleware.py", line 157, in __call__
    await load_session(connection)
  File "starsessions/session.py", line 49, in load_session
    await get_session_handler(connection).load()
  File "starsessions/session.py", line 110, in load
    await self.store.read(
  File "xxxxxxx/asgi_session.py", line 28, in read
    value, invalid = typing.cast(tuple[bytes | None, bool], await pipe.execute())
  File "ddtrace/contrib/redis/asyncio_patch.py", line 29, in traced_async_execute_pipeline
    return await func(*args, **kwargs)
  File "redis/asyncio/client.py", line 1377, in execute
    return await conn.retry.call_with_retry(
  File "redis/asyncio/retry.py", line 62, in call_with_retry
    await fail(error)
  File "redis/asyncio/retry.py", line 59, in call_with_retry
    return await do()
  File "redis/asyncio/client.py", line 1215, in _execute_transaction
    await connection.send_packed_command(all_cmds)
  File "redis/asyncio/connection.py", line 763, in send_packed_command
    raise ConnectionError(

chayim commented 1 year ago

@sileht Can you share your connection information / the code you use to connect?

sileht commented 1 year ago

We use a connection url like this: rediss://:XXXXXXXXXXXXXXXXXXXXX@ec2-aaa-bbb-ccc-ddd.compute-1.amazonaws.com:18880.

We use the default pool by creating only one redis client in our app with: connection = redis.asyncio.client.Redis.from_url(url).

All asyncio tasks then use this client. We deliberately don't limit the number of connections, but we actively monitor the number of concurrent connections on the server side. This service is far below the server's max connection limit (around 30-40 concurrent connections; the limit is set at 400).

dvora-h commented 1 year ago

@sileht Can you provide more info on what exactly causes the error: the connection? A call to a certain command? Something else?

rhoog commented 1 year ago

Also seeing this when running redis-py 4.4.2. Not sure what is causing it. Also using AWS.

Our docker image is based on: python:3.11 (Debian GNU/Linux 11 (bullseye))

ConnectionResetError: Connection lost
  File "redis/asyncio/connection.py", line 785, in send_packed_command
    await self._writer.drain()
  File "asyncio/streams.py", line 378, in drain
    await self._protocol._drain_helper()
  File "asyncio/streams.py", line 167, in _drain_helper
    raise ConnectionResetError('Connection lost')

ConnectionError: Error UNKNOWN while writing to socket. Connection lost.
  File "nats/aio/subscription.py", line 290, in _wait_for_msgs
    await self._cb(msg)
...
  File "xxxx", line 688, in store
    await self.session.setex(
  File "redis/asyncio/client.py", line 505, in execute_command
    return await conn.retry.call_with_retry(
  File "redis/asyncio/retry.py", line 62, in call_with_retry
    await fail(error)
  File "redis/asyncio/client.py", line 494, in _disconnect_raise
    raise error
  File "redis/asyncio/retry.py", line 59, in call_with_retry
    return await do()
  File "redis/asyncio/client.py", line 480, in _send_command_parse_response
    await conn.send_command(*args)
  File "redis/asyncio/connection.py", line 805, in send_command
    await self.send_packed_command(
  File "redis/asyncio/connection.py", line 796, in send_packed_command
    raise ConnectionError(

jmcbailey commented 1 year ago

We've also started seeing this after upgrading from redis-py 4.3.4 to 4.4.2. (This is with Python 3.9, and the Azure Redis service, running version 6 of the server.)

The exception (or something like it) can be reproduced with the following script and using a local Redis server, if the server is restarted between iterations:

import asyncio

from redis.asyncio import Redis

async def run_async():
    async with Redis(host="localhost", port=6379) as redis:
        for i in range(5):
            print(i)
            try:
                await redis.get("some-key")
            except Exception as e:
                print(f"Exception: {e}")
            # Restart redis server during sleep
            await asyncio.sleep(10)

if __name__ == "__main__":
    asyncio.run(run_async())

If I restart the local Redis server between iterations 2 and 3 the output is the following:

0
1
2
Exception: Connection closed by server.
3
4

If, instead of connecting to a local server, I use a port-forward to our Azure Redis server and then stop and restart the port-forwarding process, I get the "Error UNKNOWN while writing to socket. Connection lost." error.

Neither exception happens in 4.3.5; both happen in 4.4.0. Also if I run an equivalent script using the sync Redis client, the exception does not happen in either version.

I did some investigating and it seems that in this scenario the next attempt to read from the socket after the server restart will return an empty byte sequence. Prior to 4.4.0 the SocketBuffer._read_from_socket() method specifically checked for this, and raised a ConnectionError. This connection error resulted in a disconnect and re-connect before the connection was returned by the connection pool.

But the SocketBuffer class was removed in 4.4.0 and the new implementation (can_read_destructive()) doesn't check for an empty byte sequence, so the connection pool doesn't detect that the connection is no longer valid and the unhandled ConnectionError then occurs the next time the client tries to write to it.
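The empty-read behaviour described above can be illustrated without Redis at all. The following is a minimal sketch using only asyncio streams, not redis-py's actual code: `demo_eof_detection` and `connection_is_dead` are hypothetical names, and the throwaway local server merely stands in for a Redis server that went away.

```python
import asyncio


async def demo_eof_detection() -> bytes:
    """Return what a client reads after the server has closed the connection."""

    async def handle(reader, writer):
        # Stand-in for a server restart: close the connection immediately.
        writer.close()
        await writer.wait_closed()

    # Port 0 lets the OS pick a free port for this throwaway server.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        # An empty bytes object here means EOF, not "no data yet".
        data = await reader.read(1024)
        writer.close()
        await writer.wait_closed()
    return data


def connection_is_dead(data: bytes) -> bool:
    # Hypothetical helper: an empty read means the peer closed the socket,
    # so the connection should be dropped and re-established before reuse.
    return data == b""


if __name__ == "__main__":
    data = asyncio.run(demo_eof_detection())
    print(connection_is_dead(data))
```

A health check that treats an empty read as "nothing to read" rather than "peer closed" would hand the dead connection back to the caller, which matches the failure on the next write described above.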

sileht commented 1 year ago

I investigated a bit further, and it looks like the bug only occurs with PythonParser. If I install hiredis, I cannot trigger it.

jmcbailey commented 1 year ago

@sileht Yeah, same here (hadn't thought to check that before!). Makes sense since the HiredisParser is checking the returned buffer whereas the PythonParser is not.

elarrat commented 1 year ago

We recently upgraded from Python 3.9 to 3.11 and, with that, our redis package to 4.4.2, and then started facing the very same problem. I tried upgrading to 4.5.1, but the problem persists.

Our log looks like this:

2023-02-20T21:18:58.174551945Z     await self.redis_client.set(simulation_uuid, core_json, ex=STUDY_TTL_IN_SECONDS)
2023-02-20T21:18:58.174557945Z   File "/usr/local/lib/python3.11/site-packages/redis/asyncio/client.py", line 514, in execute_command
2023-02-20T21:18:58.174561245Z     return await conn.retry.call_with_retry(
2023-02-20T21:18:58.174564245Z            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-02-20T21:18:58.174567245Z   File "/usr/local/lib/python3.11/site-packages/redis/asyncio/retry.py", line 62, in call_with_retry
2023-02-20T21:18:58.174570345Z     await fail(error)
2023-02-20T21:18:58.174574545Z   File "/usr/local/lib/python3.11/site-packages/redis/asyncio/client.py", line 501, in _disconnect_raise
2023-02-20T21:18:58.174577745Z     raise error
2023-02-20T21:18:58.174580745Z   File "/usr/local/lib/python3.11/site-packages/redis/asyncio/retry.py", line 59, in call_with_retry
2023-02-20T21:18:58.174583845Z     return await do()
2023-02-20T21:18:58.174586745Z            ^^^^^^^^^^
2023-02-20T21:18:58.174589745Z   File "/usr/local/lib/python3.11/site-packages/redis/asyncio/client.py", line 487, in _send_command_parse_response
2023-02-20T21:18:58.174593145Z     await conn.send_command(*args)
2023-02-20T21:18:58.174596045Z   File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 808, in send_command
2023-02-20T21:18:58.174599245Z     await self.send_packed_command(
2023-02-20T21:18:58.174602145Z   File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 799, in send_packed_command
2023-02-20T21:18:58.174605445Z     raise ConnectionError(
2023-02-20T21:18:58.174608445Z redis.exceptions.ConnectionError: Error UNKNOWN while writing to socket. Connection lost.

sadilet commented 1 year ago

@elarrat Hi, have you solved this issue?

gordianberger commented 1 year ago

We had the same issue on 4.5.3. Only downgrading to 4.3.5, as mentioned above, seems to work.

manawasp commented 1 year ago

We encountered the same issue. As mentioned above, installing the hiredis flavor let us work around it.

wbwlkr commented 1 year ago

We encountered the same issue as well. As @jmcbailey said, installing hiredis fixed the "Error UNKNOWN while writing to socket. Connection lost." error.

mcursa-jwt commented 3 months ago

hi, we were encountering this same error on:

Both methods (downgrading the redis version or installing hiredis) removed the error. We decided to install hiredis on the production servers.

Installing hiredis seems to be the safer solution, as other installed packages might also depend on that specific version of redis (and async-timeout).

JamesHutchison commented 1 month ago

Just want to chime in: it appears I'm getting bitten by this using Azure's Redis instance.

Error UNKNOWN while writing to socket. Connection lost.

Is someone able to clarify what "installed hiredis" means? Like did you only pip install it or is there some process you have to go through to get it hooked up?

Edit:

To answer my own question, it appears to be installed automatically. You can verify with:

>>> from redis.asyncio.connection import DefaultParser
>>> DefaultParser
<class 'redis._parsers.hiredis._AsyncHiredisParser'>
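If you would rather not import redis internals, a rough alternative is to check whether the hiredis package is importable at all. This is only a sketch: it assumes redis-py selects the hiredis parser whenever hiredis can be imported, and `hiredis_is_importable` is a hypothetical helper name, not part of redis-py. If it prints False, installing hiredis separately (for example with `pip install hiredis` or the `redis[hiredis]` extra) should make it available.

```python
import importlib.util


def hiredis_is_importable() -> bool:
    """Return True if the hiredis package can be found in this environment."""
    # find_spec only locates the package; it does not import or run it.
    return importlib.util.find_spec("hiredis") is not None


if __name__ == "__main__":
    print("hiredis importable:", hiredis_is_importable())
```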