Closed lihui920913 closed 6 years ago
Hi lihui, that makes sense. In fact, there is another place that should also take the reconnect event into account. By the way, lines 268 and 397 in your PR are duplicates; a small refactor would be great :)
Hi stream, thanks for your fast response. The duplicate code has been resolved. To tell the truth, I'm not quite clear about that other place. Maybe I'll look into it later.
Here is the issue: Assume S1, S2 and S3 are three Vert.x applications connected to the same ZooKeeper server. S1 registers a consumer on an address (let's say "addr") using the event bus. There should then be two nodes in ZooKeeper: /io.vertx/asyncMultiMap/__vertx.subs/addr/S1 and /io.vertx/cluster/nodes/S1. If the connection between S1 and zkServer is unstable, the session may be disconnected, causing both nodes to be removed since they are both ephemeral. After a few seconds the connection is restored, but only one node (/io.vertx/asyncMultiMap/__vertx.subs/addr/S1) is restored (ZkAsyncMultiMap.restoreSnapshotCache), and the event bus goes back to normal. However, if S3 disconnects from ZooKeeper after that, it triggers the ZookeeperClusterManager.nodeListener.nodeLeft method, which picks one node to run checkSubs, which in turn removes all nodes under /io.vertx/asyncMultiMap/__vertx.subs whose nodeId doesn't exist in /io.vertx/cluster/nodes. Because /io.vertx/cluster/nodes/S1 was never restored, S1's subscription on "addr" is removed as well, even though S1 is still alive. Looking forward to your reply.
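To make the sequence above easier to follow, here is a minimal in-memory sketch of it. It does not use the real ZooKeeper, Curator or vertx-zookeeper APIs: the class `EphemeralNodeSketch`, its methods and the `restoreClusterNode` flag are hypothetical names made up for illustration, and `checkSubs` here only mimics the cleanup behaviour described above (dropping subscriptions whose nodeId has no entry under /io.vertx/cluster/nodes).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class EphemeralNodeSketch {
    static final String NODES = "/io.vertx/cluster/nodes/";
    static final String SUBS = "/io.vertx/asyncMultiMap/__vertx.subs/addr/";

    // In-memory stand-in for the ephemeral znodes held by the ZooKeeper server.
    private final Set<String> znodes = new TreeSet<>();

    void join(String nodeId) { znodes.add(NODES + nodeId); }

    void subscribe(String nodeId) { znodes.add(SUBS + nodeId); }

    // A lost session removes every ephemeral znode owned by that node.
    void sessionExpired(String nodeId) {
        znodes.remove(NODES + nodeId);
        znodes.remove(SUBS + nodeId);
    }

    // Mimics the described cleanup: drop subscriptions whose nodeId
    // has no matching entry under /io.vertx/cluster/nodes.
    void checkSubs() {
        List<String> stale = new ArrayList<>();
        for (String path : znodes) {
            if (path.startsWith(SUBS)
                    && !znodes.contains(NODES + path.substring(SUBS.length()))) {
                stale.add(path);
            }
        }
        znodes.removeAll(stale);
    }

    // Replays the scenario; restoreClusterNode is a hypothetical switch that
    // models re-creating /io.vertx/cluster/nodes/S1 on reconnect.
    static boolean s1StillSubscribed(boolean restoreClusterNode) {
        EphemeralNodeSketch zk = new EphemeralNodeSketch();
        zk.join("S1");
        zk.join("S2");
        zk.join("S3");
        zk.subscribe("S1");

        zk.sessionExpired("S1");   // unstable connection: both S1 znodes vanish
        zk.subscribe("S1");        // only the subscription znode is restored
        if (restoreClusterNode) {
            zk.join("S1");
        }

        zk.sessionExpired("S3");   // S3 leaves, which triggers the cleanup
        zk.checkSubs();
        return zk.znodes.contains(SUBS + "S1");
    }

    public static void main(String[] args) {
        System.out.println(s1StillSubscribed(false)); // prints "false"
        System.out.println(s1StillSubscribed(true));  // prints "true"
    }
}
```

In this model, S1's subscription only survives S3 leaving if its entry under /io.vertx/cluster/nodes is re-created together with the subscription node after the reconnect.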