pravega / zookeeper-operator

Kubernetes Operator for Zookeeper
Apache License 2.0
367 stars 202 forks

Cannot scale ZooKeeper cluster from 3 nodes to 2 nodes #575

Open walleliu1016 opened 11 months ago

walleliu1016 commented 11 months ago

Description

I created a ZooKeeper cluster with 3 replicas, and the operator successfully created 3 pods. I checked the servers with echo srvr | nc localhost 2181, and the result shows the cluster is healthy, with 2 followers and 1 leader.

Then I changed replicas to 2. pod-2 terminated, and later pod-1 restarted. In the end the cluster is not healthy: according to pod-1's logs, the nc check does not return imok, so the pod cannot become ready. [image: WeCom screenshot 1700318569643]

The cluster status shows pod-1 restarting, with this log: [image: WeCom screenshot 17003186239759] [image: WeCom screenshot 1700318668170]

I also checked pod-0 with nc: [image: WeCom screenshot 17003187977701] The server is not ready to handle requests, so the cluster is broken.
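
For anyone trying to reproduce this, the nc checks above can be scripted. The following is only a rough sketch, not part of the operator: the helper names four_letter_word and parse_mode are made up for illustration, and it assumes the srvr and ruok four-letter-word commands are allowed by the server (newer ZooKeeper versions restrict them through 4lw.commands.whitelist) and that port 2181 is reachable, for example from inside the pod.

```python
import socket
from typing import Optional


def four_letter_word(cmd: str, host: str = "localhost", port: int = 2181,
                     timeout: float = 5.0) -> str:
    """Send a four-letter-word command (e.g. "srvr", "ruok") and return the raw reply.

    Hypothetical helper: it mirrors `echo <cmd> | nc host port`.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        # The server closes the connection after replying, which ends this loop.
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(srvr_reply: str) -> Optional[str]:
    """Return the value of the "Mode:" line of a srvr reply, or None if it is missing.

    A server that is not serving requests replies without a "Mode:" line.
    """
    for line in srvr_reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None
```

With this, parse_mode(four_letter_word("srvr")) should give leader or follower on a healthy member and None on one that is not serving, and four_letter_word("ruok") should return imok on a server that is running.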

Importance

blocker

Location

(Where is the piece of code, package, or document affected by this issue?)

Suggestions for an improvement

(How do you suggest to fix or proceed with this issue?)

kel00s commented 10 months ago

I haven't checked whether this is intentional, but please keep in mind that it's recommended that a ZooKeeper ensemble have an odd number of machines. The quorum rule for a healthy ensemble is given by the formula Q = 2N+1, where Q is the number of nodes the ensemble needs in order to tolerate N failed nodes.
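
To make the arithmetic concrete, here is a small sketch that assumes ZooKeeper's default majority quorum (more than half of the servers must be up). The function names are only illustrative.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can fail while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


# Compare small ensembles: even sizes add no fault tolerance over the odd size below them.
for size in (1, 2, 3, 4, 5):
    print(f"ensemble={size} quorum={quorum_size(size)} "
          f"tolerated_failures={tolerated_failures(size)}")
```

Under that assumption, a 3-server ensemble tolerates 1 failure, while a 2-server ensemble needs both servers up and tolerates none, so it is no more fault tolerant than a single server.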