Open noorul opened 1 year ago
Here is the actual exception which is causing the node to be removed
{"event": "2023-01-19 15:31:10 [error ] Failed to perform redis health check request_id=dcca0f39-d43c-4b76-9177-4befa4af2ea0\nTraceback (most recent call last):\n File \"/app/main.py\", line 90, in healthz\n future = await redis.ping()\n File \"/.venv/lib/python3.8/site-packages/redis/asyncio/cluster.py\", line 663, in execute_command\n await self.initialize()\n File \"/.venv/lib/python3.8/site-packages/redis/asyncio/cluster.py\", line 370, in initialize\n await self.nodes_manager.initialize()\n File \"/.venv/lib/python3.8/site-packages/redis/asyncio/cluster.py\", line 1281, in initialize\n raise RedisClusterException(\nredis.exceptions.RedisClusterException: Redis Cluster cannot be connected. Please provide at least one reachable node: None", "request_id": "dcca0f39-d43c-4b76-9177-4befa4af2ea0", "logger": "app.main", "level": "error", "process": 43, "@timestamp": "2023-01-19T15:31:10.209442Z"}
{"event": "10.2.91.1:48188 - \"GET /healthz HTTP/1.1\" 500", "request_id": "dcca0f39-d43c-4b76-9177-4befa4af2ea0", "logger": "uvicorn.access", "level": "info", "process": 43, "@timestamp": "2023-01-19T15:31:10.210536Z"}
Traceback (most recent call last):
File "/.venv/lib/python3.8/site-packages/redis/asyncio/cluster.py", line 737, in _execute_command
return await target_node.execute_command(*args, **kwargs)
File "/.venv/lib/python3.8/site-packages/redis/asyncio/cluster.py", line 1009, in execute_command
return await self.parse_response(connection, args[0], **kwargs)
File "/.venv/lib/python3.8/site-packages/redis/asyncio/cluster.py", line 985, in parse_response
response = await connection.read_response()
File "/.venv/lib/python3.8/site-packages/redis/asyncio/connection.py", line 800, in read_response
response = await self._parser.read_response(
File "/.venv/lib/python3.8/site-packages/redis/asyncio/connection.py", line 373, in read_response
await self.read_from_socket()
File "/.venv/lib/python3.8/site-packages/redis/asyncio/connection.py", line 358, in read_from_socket
raise ConnectionError(SERVER_CLOSED_CONNECTION_ERROR) from None
redis.exceptions.ConnectionError: Connection closed by server.
This is actually causing problems in production. Can someone please confirm this issue?
Hi, I'm having the same issue
$ pip freeze | grep redis
redis==4.4.2
$ python --version
Python 3.9.15
$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
...
Traceback (most recent call last):
File "/opt/venv/lib/python3.9/site-packages/redis/asyncio/cluster.py", line 373, in initialize
await self.nodes_manager.initialize()
File "/opt/venv/lib/python3.9/site-packages/redis/asyncio/cluster.py", line 1295, in initialize
raise RedisClusterException(
redis.exceptions.RedisClusterException: Redis Cluster cannot be connected. Please provide at least one reachable node: None
This happens after I get
Traceback (most recent call last):
File "/opt/venv/lib/python3.9/site-packages/redis/asyncio/cluster.py", line 744, in _execute_command
return await target_node.execute_command(*args, **kwargs)
File "/opt/venv/lib/python3.9/site-packages/redis/asyncio/cluster.py", line 1018, in execute_command
return await self.parse_response(connection, args[0], **kwargs)
File "/opt/venv/lib/python3.9/site-packages/redis/asyncio/cluster.py", line 994, in parse_response
response = await connection.read_response()
File "/opt/venv/lib/python3.9/site-packages/redis/asyncio/connection.py", line 829, in read_response
response = await self._parser.read_response(
File "/opt/venv/lib/python3.9/site-packages/redis/asyncio/connection.py", line 256, in read_response
response = await self._read_response(disable_decoding=disable_decoding)
File "/opt/venv/lib/python3.9/site-packages/redis/asyncio/connection.py", line 266, in _read_response
raw = await self._readline()
File "/opt/venv/lib/python3.9/site-packages/redis/asyncio/connection.py", line 343, in _readline
raise ConnectionError(SERVER_CLOSED_CONNECTION_ERROR)
redis.exceptions.ConnectionError: Connection closed by server.
several times. Unfortunately I cannot provide minimal example as this is part of our test suit.
I don't get why I get the ConnectionError
in the first place, because there should be no problems connecting to the server. When I try it in Python REPL or use redis-cli
I can connect without problems.
I tried to use add condition to https://github.com/redis/redis-py/blob/v4.4.0/redis/asyncio/cluster.py#L744:
if len(self.nodes_manager.startup_nodes) > 1:
self.nodes_manager.startup_nodes.pop(target_node.name, None)
but now it fails also on
Traceback (most recent call last):
File "/opt/venv/lib/python3.9/site-packages/redis/asyncio/cluster.py", line 373, in initialize
await self.nodes_manager.initialize()
GeneratorExit
EDIT: I checked the redis debug logs and it seems that client creates 5 connections and then closes them. No log about server closing the connection as would the ConnectionError
suggests.
8:M 24 Jan 2023 07:29:40.461 - Accepted 172.23.0.3:50644
8:M 24 Jan 2023 07:29:40.473 - Accepted 172.23.0.3:50646
8:M 24 Jan 2023 07:29:40.473 - Accepted 172.23.0.3:50658
8:M 24 Jan 2023 07:29:40.575 - Accepted 172.23.0.3:50668
8:M 24 Jan 2023 07:29:40.677 - Accepted 172.23.0.3:50678
8:M 24 Jan 2023 07:29:41.635 - DB 0: 3 keys (0 volatile) in 4 slots HT.
8:M 24 Jan 2023 07:29:41.635 . 5 clients connected (0 replicas), 1561560 bytes in use
8:M 24 Jan 2023 07:29:41.719 - Client closed connection
8:M 24 Jan 2023 07:29:41.719 - Client closed connection
8:M 24 Jan 2023 07:29:41.719 - Client closed connection
8:M 24 Jan 2023 07:29:41.719 - Client closed connection
8:M 24 Jan 2023 07:29:41.719 - Client closed connection
In my case also, if I manually access Redis, it works.
FYI I was trying this with the newest redis version 4.5.1 and the issue persists.
This issue is marked stale. It will be closed in 30 days if it is not updated.
This issue is still relevant
This is still happening in the production. Is there any update on it ? I'm also facing similar issues - @noorul
I am yet to migrate :(. I am not sure how to get attention from maintainers.
Version: What redis-py and what redis version is the issue happening on?
redis-py version is 4.4.0
Platform: What platform / version? (For example Python 3.5.1 on Windows 7 / Ubuntu 15.10 / Azure)
Description: Description of your issue, stack traces from errors and code that reproduces the issue
I have the following code in the health check API for the application
The following class is the Singleton wrapper for RedisCluster
Kubernetes liveness probe calls /healthz API and fails after some time and logs have the following traceback.
After some debugging, I found that the code https://github.com/redis/redis-py/blob/v4.4.0/redis/asyncio/cluster.py#L744 is getting executed. Once the failed node is removed then that thread never recovers.