31z4 / zookeeper-docker

Docker image packaging for Apache Zookeeper
MIT License

Modify docker-compose ZOO_SERVERS env #118

Closed diego2glez closed 3 years ago

diego2glez commented 3 years ago

Hi,

The docker-compose configuration example in the "How to use this image" section uses 0.0.0.0:2888:3888 as each node's own address. This seems to work fine until the leader goes down, at which point the cluster becomes inaccessible to clients.

Expected behavior

In a cluster with an odd number of nodes (>=3), if the leader shuts down, the remaining nodes elect a new leader and clients can connect without problems.

Actual behavior

Using (one value per node):

ZOO_SERVERS: server.1=0.0.0.0:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888;2181
ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=0.0.0.0:2888:3888;2181 server.3=zoo3:2888:3888;2181
ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=0.0.0.0:2888:3888;2181

In a cluster with an odd number of nodes (>=3), if the leader shuts down, the remaining nodes elect a new leader but clients can't connect until everything is restarted.

Steps to reproduce the behavior

Use the provided docker-compose configuration. Shut down the leader node and try to connect with zkCli.sh (or another client): the client gets stuck in the "CONNECTING" state and the cluster becomes inaccessible.

Solution

On each node, replace 0.0.0.0:2888:3888 with that node's own FQDN (for example zoo1:2888:3888 on the first node).
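
As a rough sketch of what this looks like for the first node: the fragment below assumes the service layout of the image's documented compose example (the `image`, `hostname`, `ports` and `ZOO_MY_ID` entries are assumptions, not taken from this issue). Only the ZOO_SERVERS value reflects the proposed change.

```yaml
services:
  zoo1:
    image: zookeeper          # assumed image name
    hostname: zoo1            # must match the name used in server.1 below
    ports:
      - 2181:2181             # assumed client port mapping
    environment:
      ZOO_MY_ID: 1
      # server.1 now uses the node's own hostname instead of 0.0.0.0
      ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888;2181
```

With this change every node ends up with the same ZOO_SERVERS value, so zoo2 and zoo3 would differ only in their service name, hostname, ZOO_MY_ID and published port.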

31z4 commented 3 years ago

Hi @diego2glez, thanks for reporting this! 0.0.0.0 was introduced to address an issue with user-defined networks. See https://github.com/31z4/zookeeper-docker/pull/27. However, I'm not sure it's still relevant. Could you please add more details about the environment where you see the issue?

diego2glez commented 3 years ago

Hi @31z4

If you take the docker-compose file provided on Docker Hub and kill the leader node, you will be able to reproduce the problem: the client stays in the "CONNECTING" state and cannot reach the servers.

Docker version 20 and later includes new functionality that makes the container hostname resolvable through DNS. This solves the problem of ZooKeeper binding to a random IP, as you can now use a resolvable hostname 😃

(docker-ce new functionality code: https://github.com/docker/docker-ce/commit/5d4cf4d05b4cd613f8c62fc27a58c0fb5a693cf2#diff-3993b098ff37a3bd60d8b8a6b6a54758998b5a0e5fb8eff757b27e5b02f8607a)

31z4 commented 3 years ago

Thanks @diego2glez and sorry for the delayed response. Documentation will be updated once https://github.com/docker-library/docs/pull/1969 is merged.