pravega / zookeeper-operator

Kubernetes Operator for Zookeeper
Apache License 2.0

Zookeeper statefulset down #548

Open vipul-06 opened 1 year ago

vipul-06 commented 1 year ago

I have ZooKeeper installed through Helm and running in GKE, but my pods suddenly went down. Out of a StatefulSet of 5 replicas, only one pod is currently present, and it keeps restarting with the logs below:

+ source /conf/env.sh
++ DOMAIN=example-new-solrcloud-zookeeper-headless.dev-backend.svc.cluster.local
++ QUORUM_PORT=2888
++ LEADER_PORT=3888
++ CLIENT_HOST=example-new-solrcloud-zookeeper-client
++ CLIENT_PORT=2181
++ ADMIN_SERVER_HOST=example-new-solrcloud-zookeeper-admin-server
++ ADMIN_SERVER_PORT=8080
++ CLUSTER_NAME=example-new-solrcloud-zookeeper
++ CLUSTER_SIZE=5
+ source /usr/local/bin/zookeeperFunctions.sh
++ set -ex
++ hostname -s
+ HOST=example-new-solrcloud-zookeeper-0
+ DATA_DIR=/data
+ MYID_FILE=/data/myid
+ LOG4J_CONF=/conf/log4j-quiet.properties
+ DYNCONFIG=/data/zoo.cfg.dynamic
+ STATIC_CONFIG=/data/conf/zoo.cfg
+ [[ example-new-solrcloud-zookeeper-0 =~ (.*)-([0-9]+)$ ]]
+ NAME=example-new-solrcloud-zookeeper
+ ORD=0
+ MYID=1
+ WRITE_CONFIGURATION=true
+ REGISTER_NODE=true
+ ONDISK_MYID_CONFIG=false
+ ONDISK_DYN_CONFIG=false
+ '[' -f /data/myid ']'
++ cat /data/myid
+ EXISTING_ID=1
+ [[ 1 == \1 ]]
+ [[ -f /data/conf/zoo.cfg ]]
+ ONDISK_MYID_CONFIG=true
+ '[' -f /data/zoo.cfg.dynamic ']'
+ ONDISK_DYN_CONFIG=true
+ set +e
+ [[ -n '' ]]
+ set -e
+ set +e
+ getent hosts example-new-solrcloud-zookeeper-headless.dev-backend.svc.cluster.local
+ [[ 2 -eq 0 ]]
+ grep -q 'server can'\''t find example-new-solrcloud-zookeeper-headless.dev-backend.svc.cluster.local'
+ nslookup example-new-solrcloud-zookeeper-headless.dev-backend.svc.cluster.local
+ echo 'there is no active ensemble'
+ ACTIVE_ENSEMBLE=false
there is no active ensemble
+ [[ true == true ]]
+ [[ true == true ]]
+ WRITE_CONFIGURATION=false
+ [[ false == false ]]
+ REGISTER_NODE=false
+ [[ false == true ]]
+ [[ false == true ]]
+ ZOOCFGDIR=/data/conf
+ export ZOOCFGDIR
+ echo Copying /conf contents to writable directory, to support Zookeeper dynamic reconfiguration
Copying /conf contents to writable directory, to support Zookeeper dynamic reconfiguration
+ [[ ! -d /data/conf ]]
+ echo Copying the /conf/zoo.cfg contents except the dynamic config file during restart
Copying the /conf/zoo.cfg contents except the dynamic config file during restart
++ head -n -1 /conf/zoo.cfg
++ tail -n 1 /data/conf/zoo.cfg
+ echo -e '4lw.commands.whitelist=cons, envi, conf, crst, srvr, stat, mntr, ruok
dataDir=/data
standaloneEnabled=false
reconfigEnabled=true
skipACL=yes
metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider
metricsProvider.httpPort=7000
metricsProvider.exportJvmInfo=true
initLimit=10
syncLimit=2
tickTime=2000
globalOutstandingLimit=1000
preAllocSize=65536
snapCount=10000
commitLogCount=500
snapSizeLimitInKb=4194304
maxCnxns=0
maxClientCnxns=60
minSessionTimeout=4000
maxSessionTimeout=40000
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
quorumListenOnAllIPs=false
admin.serverPort=8080\n'
+ cp -f /conf/log4j.properties /data/conf
+ cp -f /conf/log4j-quiet.properties /data/conf
+ cp -f /conf/env.sh /data/conf
+ '[' -f /data/zoo.cfg.dynamic ']'
+ echo Starting zookeeper service
+ zkServer.sh --config /data/conf start-foreground
ZooKeeper JMX enabled by default
Using config: /data/conf/zoo.cfg
2023-04-12 07:25:53,056 [myid:] - INFO  [main:QuorumPeerConfig@174] - Reading configuration from: /data/conf/zoo.cfg
2023-04-12 07:25:53,062 [myid:] - INFO  [main:QuorumPeerConfig@451] - clientPort is not set
2023-04-12 07:25:53,063 [myid:] - INFO  [main:QuorumPeerConfig@464] - secureClientPort is not set
2023-04-12 07:25:53,063 [myid:] - INFO  [main:QuorumPeerConfig@480] - observerMasterPort is not set
2023-04-12 07:25:53,065 [myid:] - INFO  [main:QuorumPeerConfig@497] - metricsProvider.className is org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider
2023-04-12 07:25:53,068 [myid:] - ERROR [main:QuorumPeerMain@98] - Invalid config, exiting abnormally
org.apache.zookeeper.server.quorum.QuorumPeerConfig$ConfigException: Error processing /data/conf/zoo.cfg
    at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:198)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:124)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:90)
Caused by: java.lang.IllegalArgumentException: standaloneEnabled = false then number of participants should be >0
    at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parseDynamicConfig(QuorumPeerConfig.java:711)
    at org.apache.zookeeper.server.quorum.QuorumPeerConfig.setupQuorumPeerConfig(QuorumPeerConfig.java:679)
    at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parseProperties(QuorumPeerConfig.java:507)
    at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:194)
    ... 2 more
Invalid config, exiting abnormally
2023-04-12 07:25:53,073 [myid:] - INFO  [main:ZKAuditProvider@42] - ZooKeeper audit is disabled.
2023-04-12 07:25:53,076 [myid:] - ERROR [main:ServiceUtils@42] - Exiting JVM with code 2

Has anyone else run into this issue? Any help would be appreciated.
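
For anyone debugging the same trace: ZooKeeper raises this `IllegalArgumentException` when `standaloneEnabled=false` and the configuration it parses contains no participant `server.N=` entries. In the trace above, the config echoed after `tail -n 1 /data/conf/zoo.cfg` ends at `admin.serverPort=8080`, so one possible cause (an assumption, not confirmed in this thread) is that the `dynamicConfigFile=` line is missing from `/data/conf/zoo.cfg`, or that the dynamic file it points to is empty. A minimal sketch for checking this inside the pod follows; `count_servers` and `check_zk_config` are hypothetical helper names, not part of the operator image.

```shell
# Hypothetical diagnostic helpers; not part of the operator image.

# Print how many server.N=... entries a file contains (0 if the file is missing).
count_servers() {
    if [ -f "$1" ]; then
        # grep -c prints 0 and exits non-zero when nothing matches
        grep -c '^server\.[0-9][0-9]*=' "$1" || true
    else
        echo 0
    fi
}

# Report which file ZooKeeper would read its server entries from.
# Returns 0 if at least one server entry is found, 1 otherwise.
check_zk_config() {
    static_cfg=$1
    # Path taken from the dynamicConfigFile= line, if one is present
    dyn_file=$(sed -n 's/^dynamicConfigFile=//p' "$static_cfg" | tail -n 1)
    if [ -n "$dyn_file" ]; then
        n=$(count_servers "$dyn_file")
        echo "dynamicConfigFile=$dyn_file has $n server entries"
    else
        n=$(count_servers "$static_cfg")
        echo "no dynamicConfigFile line; static config has $n server entries"
    fi
    [ "$n" -gt 0 ]
}
```

Running `check_zk_config /data/conf/zoo.cfg` in the failing container (paths taken from the trace) should show whether the static config still references a dynamic file and whether that file lists any servers. The helper counts every `server.N=` line, observers included, so a non-zero count is necessary but not sufficient for a valid quorum config.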

k0nstantinv commented 10 months ago

I can confirm the same error with zk operator 0.2.15.

It is a 3-replica StatefulSet, and zookeeper-1 crash-loops indefinitely:

Starting zookeeper service
+ zkServer.sh --config /data/conf start-foreground
ZooKeeper JMX enabled by default
Using config: /data/conf/zoo.cfg
2023-11-20 11:25:44,803 [myid:] - INFO  [main:QuorumPeerConfig@174] - Reading configuration from: /data/conf/zoo.cfg
2023-11-20 11:25:44,805 [myid:] - INFO  [main:QuorumPeerConfig@435] - clientPort is not set
2023-11-20 11:25:44,805 [myid:] - INFO  [main:QuorumPeerConfig@448] - secureClientPort is not set
2023-11-20 11:25:44,806 [myid:] - INFO  [main:QuorumPeerConfig@464] - observerMasterPort is not set
2023-11-20 11:25:44,806 [myid:] - INFO  [main:QuorumPeerConfig@481] - metricsProvider.className is org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider
2023-11-20 11:25:44,808 [myid:] - ERROR [main:QuorumPeerMain@99] - Invalid config, exiting abnormally
org.apache.zookeeper.server.quorum.QuorumPeerConfig$ConfigException: Error processing /data/conf/zoo.cfg
    at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:198)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:125)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:91)
Caused by: java.lang.IllegalArgumentException: standaloneEnabled = false then number of participants should be >0
    at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parseDynamicConfig(QuorumPeerConfig.java:695)
    at org.apache.zookeeper.server.quorum.QuorumPeerConfig.setupQuorumPeerConfig(QuorumPeerConfig.java:663)
    at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parseProperties(QuorumPeerConfig.java:491)
    at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:194)
    ... 2 more
Invalid config, exiting abnormally
2023-11-20 11:25:44,810 [myid:] - INFO  [main:ZKAuditProvider@42] - ZooKeeper audit is disabled.
2023-11-20 11:25:44,812 [myid:] - ERROR [main:ServiceUtils@48] - Exiting JVM with code 2
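
To compare what is on disk against what a healthy member entry would look like, the derivation visible in the first trace (`ORD=0` parsed from the hostname, `MYID=1`) can be sketched as below. This is an illustration only: `expected_server_line` is a hypothetical helper, the default ports are the ones shown in the trace's `env.sh`, and the output follows ZooKeeper's documented dynamic-config syntax `server.<id>=<host>:<quorum port>:<election port>:<role>;<client port>`. It is not claimed to be the operator's own code, and writing such a line by hand is not confirmed here as a fix.

```shell
# Hypothetical helper: print the server entry one would expect for a pod,
# given its short hostname and the headless-service domain.
expected_server_line() {
    host=$1
    domain=$2
    quorum_port=${3:-2888}
    leader_port=${4:-3888}
    client_port=${5:-2181}

    # The hostname must look like <name>-<ordinal>
    case $host in
        *-*) ord=${host##*-} ;;   # text after the last '-'
        *)   ord='' ;;
    esac
    case $ord in
        ''|*[!0-9]*)
            echo "cannot parse ordinal from hostname: $host" >&2
            return 1 ;;
    esac

    # myid is the ordinal plus one (ordinals are assumed to have no leading zeros)
    myid=$((ord + 1))
    echo "server.${myid}=${host}.${domain}:${quorum_port}:${leader_port}:participant;${client_port}"
}
```

For the crashing pod in a 3-replica set, `expected_server_line <cluster-name>-1 <headless-domain>` would yield an entry starting with `server.2=`; if nothing like it appears in the dynamic config file on the pod's volume, that would be consistent with the "number of participants should be >0" error.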