Open saffroy opened 5 years ago
If I change my code to always call zk.close() before zk.start(), then the problem disappears: the socketpairs don't get clogged with unread bytes, as they are closed and re-created on every connection attempt.
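For anyone who wants to try the same workaround, here is a minimal sketch of it. The helper name restart_client and its timeout default are hypothetical, not part of Kazoo; the only calls it relies on are the client's own close() and start().

```python
def restart_client(zk, timeout=15):
    """Workaround sketch: close leftover resources before each start().

    `zk` is expected to be a kazoo.client.KazooClient (or any object with
    the same close()/start() methods). The helper name and the timeout
    default are illustrative assumptions, not Kazoo API.
    """
    # close() releases resources left over from a previous failed attempt,
    # so the next start() begins with a fresh socketpair.
    zk.close()
    zk.start(timeout=timeout)


# Hypothetical usage, assuming kazoo is installed:
#   from kazoo.client import KazooClient
#   zk = KazooClient(hosts="localhost:2181")
#   restart_client(zk)
```

If start() fails, it still raises as usual; the caller keeps its own retry loop and simply calls this helper on each attempt instead of calling zk.start() directly.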
I think the bug is that, on a failure, KazooClient.start() itself calls self.stop() but not self.close(). As we have multiple failed connection attempts, bytes accumulate in the ConnectionHandler socketpair, which eventually fills up.
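The "full socketpair" effect can be seen without Kazoo at all. The sketch below uses only the standard library: it writes to one end of a socketpair while nothing reads from the other end, and counts how many bytes go through before a write would block. The function name bytes_until_full is made up for this illustration, and the one-byte chunk is an assumption about the size of each wake-up write.

```python
import socket


def bytes_until_full(chunk=b"\0"):
    """Write to a socketpair that nobody reads until a send would block.

    Returns the number of bytes accepted before the buffer filled up.
    The exact number depends on the OS and its socket buffer settings.
    """
    writer, reader = socket.socketpair()
    # Non-blocking mode turns "would hang forever" into BlockingIOError,
    # so the demo terminates instead of getting stuck like the client does.
    writer.setblocking(False)
    written = 0
    try:
        while True:
            try:
                written += writer.send(chunk)
            except BlockingIOError:
                # Buffer is full: a blocking writer would stall right here.
                return written
    finally:
        writer.close()
        reader.close()


if __name__ == "__main__":
    print("bytes accepted before the socketpair filled:", bytes_until_full())
```

With a blocking socket, the send that raises BlockingIOError here would simply never return, which matches the stuck write described in this issue.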
Adding this call to close() solves the problem for me; I'll do a PR.
BTW the problem is even easier to reproduce by trying to connect to a non-existent ZK server (e.g. use localhost:12345).
PR opened in #579
Hello,
I run the test program below, and after it connects to ZooKeeper, I block access to port 2181 with iptables. After a while (~90 sec) and a number of retries, the test program gets stuck: it blocks writing to a socketpair that seems full (it seems no thread reads from it).
Test program:
iptables script:
I also run the script under python3-dbg so I can attach gdb to it, and obtain thread backtraces:
While the program runs, I monitor the state of its socketpairs with ss:
When the program blocks, ss output always ends up in this state:
Anything wrong in my program? Or is it a bug in Kazoo?