hrishikeshk closed this issue 3 years ago
Another piece of information that might help: this problem is closely related to running in Docker containers and to name resolution exceptions when a remote container is not alive. If I connect to the remote service using an IP address instead of a hostname, the Kazoo client is able to recover and reconnect, and it registers the watches again.
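As a rough illustration of the IP-address workaround described above, the sketch below rewrites a hosts string so that every hostname is replaced by its resolved IP. The helper hosts_as_ips and its resolve parameter are hypothetical and not part of Kazoo. It assumes every entry has the form name:port with no chroot suffix and no IPv6 literals, and it silently skips entries that cannot be resolved at call time.

```python
import socket


def hosts_as_ips(hosts, resolve=socket.gethostbyname):
    """Turn 'name:port,name:port' into 'ip:port,ip:port'.

    Hypothetical helper, not part of Kazoo. Entries that fail to
    resolve are skipped, so the result may be an empty string.
    """
    resolved = []
    for entry in hosts.split(","):
        # Assumes each entry is 'host:port'; split on the last colon.
        host, _, port = entry.strip().rpartition(":")
        try:
            resolved.append(f"{resolve(host)}:{port}")
        except OSError:  # socket.gaierror is a subclass of OSError
            continue
    return ",".join(resolved)
```

The result could then be passed as KazooClient(hosts=hosts_as_ips("zk:2181")). Container IPs may change across restarts, so this is at best a stopgap for versions without the fix.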
Hello,
This seems to be due to the STOP_CONNECTING state we return when none of the given hostnames can be resolved:
https://github.com/python-zk/kazoo/blob/cbdc4749edb5879099c1f9b832c055d9eeb52dea/kazoo/protocol/connection.py#L545-L546
Raising the ForceRetryError() exception instead should solve the problem (or at least trigger the expected behavior). I don't see any downside to changing this, so a PR is welcome if you have some time on your side.
It is already merged in https://github.com/python-zk/kazoo/pull/631, so I think it could be closed.
@qmorek Good point, thank you.
I am using the DataWatch recipe to watch a single node. The client and the Zookeeper server both run in Docker containers. When the Zookeeper server is unavailable for some time, the following network error is logged:
WARNING:kazoo.client:Connection dropped: socket connection error: None
Connection dropped: socket connection error: None
WARNING:kazoo.client:Cannot resolve: [Errno -5] No address associated with hostname
INFO:kazoo.client:Zookeeper session closed, state: CLOSED
After this point, even when the Zookeeper server is available again and client.restart() is attempted, the client does not reconnect.
Expected Behavior
Once the Zookeeper server becomes available again, the client should reconnect, particularly when using the DataWatch recipe.
Actual Behavior
Even with the DataWatch recipe, the client does not reconnect, and the only option is to restart the application.
Snippet to Reproduce the Problem
import logging

from kazoo.client import KazooClient, KazooState
from kazoo.protocol.states import EventType


def addWatch(zk):
    @zk.DataWatch("/log/level")
    def watch_node(data, stat, event):
        # Ignore the initial call, non create/change events, and empty nodes.
        if (event is None
                or event.type not in (EventType.CREATED, EventType.CHANGED)
                or stat.data_length <= 0):
            return True
        data.decode("utf-8")
        return True  # returning True keeps the watch registered


def conn_listener(state):
    logger = logging.getLogger()
    if state == KazooState.LOST:
        logger.fatal('Failed connecting to Zookeeper: Connection state LOST')
        zk.restart()
    elif state == KazooState.SUSPENDED:
        logger.fatal('Connection suspended to Zookeeper...')
    else:
        logger.fatal('Re-connected to Zookeeper...')


def connectHelper(connStr):
    global zk
    zk = KazooClient(hosts=connStr)
    zk.start()
    addWatch(zk)
    zk.add_listener(conn_listener)


def connectZk():
    zkHost = 'zk'
    zkPort = 2181
    # str() is needed: concatenating str and int raises TypeError.
    connectHelper(zkHost + ':' + str(zkPort))
Logs with logging in DEBUG mode
WARNING:kazoo.client:Connection dropped: socket connection error: None
Connection dropped: socket connection error: None
WARNING:kazoo.client:Cannot resolve: [Errno -5] No address associated with hostname
INFO:kazoo.client:Zookeeper session closed, state: CLOSED
Specifications