Grokzen / redis-py-cluster

Python cluster client for the official redis cluster. Redis 3.0+.
https://redis-py-cluster.readthedocs.io/
MIT License

set() return True when got ConnectionError #478

Closed liuguoshun closed 1 year ago

liuguoshun commented 2 years ago

When set() hits a ConnectionError, the return value should be nil, but it still returns True.

1. This is how I create the connection:
from rediscluster import RedisCluster

cluster_nodes = [
    {"host": "10.x.x.x", "port": "6390"},
    {"host": "10.x.x.x", "port": "6391"},
    {"host": "10.x.x.x", "port": "6390"},
    {"host": "10.x.x.x", "port": "6391"},
    {"host": "10.x.x.x", "port": "6390"},
    {"host": "10.x.x.x", "port": "6391"},
]
rc = RedisCluster(startup_nodes=cluster_nodes, decode_responses=True, socket_connect_timeout=3)

2. This is part of my code:

def start_failover_node(node, type):
    if rc.set("%s_fail" % node, lanip, ex=180, nx=True):
        logger.error(
            "got X failover lock,start failover %s with %s error." % (node, type))
        os_cmd = "nohup python3 wmha_failover.py %s %s >/dev/null 2>&1 &" % (
            node, type)
        os.system(os_cmd)
    else:
        monitor_host = rc.get("%s_fail" % node)
        logger.warning(
            "monitor host %s failover node %s with %s error." % (monitor_host, node, type))

3. This is the log:

2021-06-07 12:05:15,822 - [ERROR] ConnectionError
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/redis_py_cluster-2.1.0-py3.6.egg/rediscluster/client.py", line 626, in _execute_command
    return self.parse_response(connection, command, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/redis-3.5.3-py3.6.egg/redis/client.py", line 915, in parse_response
    response = connection.read_response()
  File "/usr/local/lib/python3.6/site-packages/redis-3.5.3-py3.6.egg/redis/connection.py", line 739, in read_response
    response = self._parser.read_response()
  File "/usr/local/lib/python3.6/site-packages/redis-3.5.3-py3.6.egg/redis/connection.py", line 324, in read_response
    raw = self._buffer.readline()
  File "/usr/local/lib/python3.6/site-packages/redis-3.5.3-py3.6.egg/redis/connection.py", line 256, in readline
    self._read_from_socket()
  File "/usr/local/lib/python3.6/site-packages/redis-3.5.3-py3.6.egg/redis/connection.py", line 201, in _read_from_socket
    raise ConnectionError(SERVER_CLOSED_CONNECTION_ERROR)
redis.exceptions.ConnectionError: Connection closed by server.
2021-06-07 12:05:16,073 - [ERROR] got X failover lock,start failover 10.200.30.100_3306 with instance error.

Grokzen commented 1 year ago

@liuguoshun The log entries you are seeing come from logging placed in the except branch that runs while a command is being executed. They do not necessarily mean the error propagates up to your code. Each command is executed inside a loop of up to 16 attempts (the TTL). If the client hits one of a set of well-known issues during an attempt, such as a failover or a plain connection error, it will in some cases rebuild its cluster state configuration, because a failover or reshard event has most likely occurred on the Redis Cluster side. Or your network might simply have had a brief hiccup, in which case we can just retry, and in most cases we get a valid connection and a successfully executed command on attempt 2, 3, or 10.

This is the intended behavior. If you were to track each node in your cluster with MONITOR, you would most likely find that your command was executed correctly on the target node, and so rc.set() returns True because the cluster did, eventually, execute the command.
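
The retry behavior described above can be illustrated with a small, self-contained sketch. This is not the actual redis-py-cluster code: `execute_with_ttl`, `FakeConnectionError`, and `make_flaky_set` are hypothetical names made up for this example, and raising `RuntimeError` once the attempts are used up is an assumption. It only shows how an error can be logged with a full traceback inside the except branch while the caller still gets True from a later attempt.

```python
import logging

logger = logging.getLogger("retry_sketch")


class FakeConnectionError(Exception):
    """Hypothetical stand-in for redis.exceptions.ConnectionError."""


def execute_with_ttl(send_command, ttl=16):
    """Call send_command() up to `ttl` times, retrying on connection errors.

    Each failure is logged inside the except branch, so a traceback can show
    up in the log even though the caller never sees the exception.
    """
    last_error = None
    for attempt in range(1, ttl + 1):
        try:
            return send_command()
        except FakeConnectionError as exc:
            # Logged here, but not re-raised: the loop simply tries again.
            logger.exception("ConnectionError on attempt %d", attempt)
            last_error = exc
    # Assumption: once every attempt has failed, the error finally surfaces.
    raise RuntimeError("TTL exhausted") from last_error


def make_flaky_set(failures):
    """Build a fake set() that fails `failures` times, then returns True."""
    state = {"calls": 0}

    def flaky_set():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise FakeConnectionError("Connection closed by server.")
        return True

    return flaky_set, state
```

With `flaky_set, state = make_flaky_set(2)`, calling `execute_with_ttl(flaky_set)` logs two tracebacks and then returns True on the third attempt, which matches the pattern in the log above: an ERROR entry with a traceback, followed by the code path for a successful set().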