python-zk / kazoo

Kazoo is a high-level Python library that makes it easier to use Apache Zookeeper.
https://kazoo.readthedocs.io
Apache License 2.0
1.3k stars, 387 forks

Reduce timeout for the first Connect() request #540

Closed (ralt closed 5 years ago)

ralt commented 5 years ago

When a ZooKeeper server is under pressure, it typically prioritizes maintaining the quorum over handling client requests. In that situation the quorum is maintained and the connection is established, but the client stays frozen waiting on that server.

Retrying after a shorter timeout means we can reconnect to another server before losing the session altogether.

ralt commented 5 years ago

Hi,

This was defined in the same way as connect_timeout. The idea is that if you have one server, the timeout can be the full timeout (/ 1). If you have more, it makes sense to split it across them, so that the client gets a chance to retry on other servers if the Connect() request fails.
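
To illustrate the idea above, here is a minimal sketch of splitting a timeout budget across servers. It is not kazoo's actual code: `per_host_timeout`, `connect_to_first_responsive`, `fake_try_connect`, and the host names are hypothetical, and the connection attempt is simulated rather than made over the network.

```python
from typing import Callable, Optional, Sequence


def per_host_timeout(total_timeout: float, host_count: int) -> float:
    """Split a total timeout budget evenly across the known servers."""
    if host_count < 1:
        raise ValueError("at least one host is required")
    return total_timeout / host_count


def connect_to_first_responsive(
    hosts: Sequence[str],
    total_timeout: float,
    try_connect: Callable[[str, float], bool],
) -> Optional[str]:
    """Try each host once with its share of the budget.

    Returns the first host that answers, or None if none do.
    """
    timeout = per_host_timeout(total_timeout, len(hosts))
    for host in hosts:
        if try_connect(host, timeout):
            return host
    return None


def fake_try_connect(host: str, timeout: float) -> bool:
    """Simulated attempt (assumption): the first server never answers."""
    return host != "zk1:2181"


hosts = ["zk1:2181", "zk2:2181", "zk3:2181"]
chosen = connect_to_first_responsive(hosts, 9.0, fake_try_connect)
```

With one server, the attempt gets the whole budget. With three servers and a 9-second budget, each attempt gets 3 seconds, so a frozen first server costs only a third of the budget before the client moves on to the next one.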

StephenSorriaux commented 5 years ago

You are right, this makes sense.

The Java client does this as well (https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java#L1383).

Can you please amend your commit so that it follows the CONTRIBUTING.md guidelines?

ralt commented 5 years ago

@StephenSorriaux does it look good now?

StephenSorriaux commented 5 years ago

@ralt That is great! Thanks again for your PR.

ralt commented 5 years ago

Thanks for the merge! :+1: