Closed: albal closed this issue 3 years ago
Hi, could you check that your zookeeper is up and running?
I followed the guide, deploying on GKE, and had no issues. I did the following:
$ helm install zookeeper bitnami/zookeeper \
--set replicaCount=3 \
--set auth.enabled=false \
--set allowAnonymousLogin=true
Wait until zookeeper is up:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
zookeeper-0 1/1 Running 0 7m56s
zookeeper-1 1/1 Running 0 7m56s
zookeeper-2 1/1 Running 0 7m56s
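Instead of polling `kubectl get pods` by hand, the wait could be scripted. A minimal sketch, assuming the chart created a StatefulSet named `zookeeper` (consistent with the pod names above, but not confirmed in this thread); `wait_for_statefulset` is a hypothetical helper name:

```shell
# Block until the StatefulSet's pods are rolled out and ready, or time out.
# $1: StatefulSet name (assumed to be "zookeeper" here)
# $2: optional timeout, defaults to 300s
wait_for_statefulset() {
  kubectl rollout status "statefulset/$1" --timeout="${2:-300s}"
}
```

Usage would be `wait_for_statefulset zookeeper` before moving on to the next step.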
Deploy kafka:
$ helm install kafka bitnami/kafka \
--set zookeeper.enabled=false \
--set replicaCount=3 \
--set externalZookeeper.servers=zookeeper.default.svc.cluster.local
This is the log I got:
...
[2021-05-27 09:57:31,089] INFO jute.maxbuffer value is 4194304 Bytes (org.apache.zookeeper.ClientCnxnSocket)
[2021-05-27 09:57:31,097] INFO zookeeper.request.timeout value is 0. feature enabled= (org.apache.zookeeper.ClientCnxn)
[2021-05-27 09:57:31,100] INFO [ZooKeeperClient Kafka server] Waiting until connected. (kafka.zookeeper.ZooKeeperClient)
[2021-05-27 09:57:31,123] INFO Opening socket connection to server zookeeper.default.svc.cluster.local/10.171.248.11:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2021-05-27 09:57:31,130] INFO Socket connection established, initiating session, client: /10.168.2.8:54432, server: zookeeper.default.svc.cluster.local/10.171.248.11:2181 (org.apache.zookeeper.ClientCnxn)
[2021-05-27 09:57:31,151] INFO Session establishment complete on server zookeeper.default.svc.cluster.local/10.171.248.11:2181, sessionid = 0x200005a23fe0000, negotiated timeout = 18000 (org.apache.zookeeper.ClientCnxn)
[2021-05-27 09:57:31,163] INFO [ZooKeeperClient Kafka server] Connected. (kafka.zookeeper.ZooKeeperClient)
[2021-05-27 09:57:31,338] INFO [feature-zk-node-event-process-thread]: Starting (kafka.server.FinalizedFeatureChangeListener$ChangeNotificationProcessorThread)
[2021-05-27 09:57:31,355] INFO Feature ZK node at path: /feature does not exist (kafka.server.FinalizedFeatureChangeListener)
[2021-05-27 09:57:31,356] INFO Cleared cache (kafka.server.FinalizedFeatureCache)
[2021-05-27 09:57:31,561] INFO Cluster ID = 2rSbK1jxQb6vSPzFYP810w (kafka.server.KafkaServer)
...
user@cap:~$ kubectl get pods
NAME READY STATUS RESTARTS AGE
busybox 0/1 Error 0 13h
kafka-0 0/1 CrashLoopBackOff 5 5m50s
kafka-1 0/1 CrashLoopBackOff 5 5m50s
kafka-2 0/1 CrashLoopBackOff 5 5m50s
zookeeper-0 1/1 Running 0 6m57s
zookeeper-1 1/1 Running 0 6m57s
zookeeper-2 1/1 Running 0 6m57s
I tore it down and followed your instructions (which seem the same as what I did) and still get the same error:
[2021-05-27 10:18:16,679] INFO jute.maxbuffer value is 4194304 Bytes (org.apache.zookeeper.ClientCnxnSocket)
[2021-05-27 10:18:16,684] INFO zookeeper.request.timeout value is 0. feature enabled= (org.apache.zookeeper.ClientCnxn)
[2021-05-27 10:18:16,686] INFO [ZooKeeperClient Kafka server] Waiting until connected. (kafka.zookeeper.ZooKeeperClient)
[2021-05-27 10:18:16,701] INFO Opening socket connection to server zookeeper.default.svc.cluster.local/:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2021-05-27 10:18:22,688] INFO [ZooKeeperClient Kafka server] Closing. (kafka.zookeeper.ZooKeeperClient)
[2021-05-27 10:18:34,697] WARN Client session timed out, have not heard from server in 18012ms for sessionid 0x0 (org.apache.zookeeper.ClientCnxn)
[2021-05-27 10:18:34,805] INFO Session: 0x0 closed (org.apache.zookeeper.ZooKeeper)
[2021-05-27 10:18:34,807] INFO EventThread shut down for session: 0x0 (org.apache.zookeeper.ClientCnxn)
[2021-05-27 10:18:34,812] INFO [ZooKeeperClient Kafka server] Closed. (kafka.zookeeper.ZooKeeperClient)
[2021-05-27 10:18:34,816] ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING
at kafka.zookeeper.ZooKeeperClient.$anonfun$waitUntilConnected$3(ZooKeeperClient.scala:271)
at kafka.zookeeper.ZooKeeperClient.waitUntilConnected(ZooKeeperClient.scala:267)
at kafka.zookeeper.ZooKeeperClient.<init>(ZooKeeperClient.scala:125)
at kafka.zk.KafkaZkClient$.apply(KafkaZkClient.scala:1948)
at kafka.server.KafkaServer.createZkClient$1(KafkaServer.scala:431)
at kafka.server.KafkaServer.initZkClient(KafkaServer.scala:456)
at kafka.server.KafkaServer.startup(KafkaServer.scala:191)
at kafka.Kafka$.main(Kafka.scala:109)
at kafka.Kafka.main(Kafka.scala)
[2021-05-27 10:18:34,818] INFO shutting down (kafka.server.KafkaServer)
[2021-05-27 10:18:34,827] INFO App info kafka.server for 0 unregistered (org.apache.kafka.common.utils.AppInfoParser)
[2021-05-27 10:18:34,828] INFO shut down completed (kafka.server.KafkaServer)
[2021-05-27 10:18:34,828] ERROR Exiting Kafka. (kafka.Kafka$)
[2021-05-27 10:18:34,837] INFO shutting down (kafka.server.KafkaServer)
I am running RKE deployed through Rancher on ESX/vCenter 7 VMs - 3 workers and one master.
Oh I see the issue - it is trying to use my public IP to connect (redacted).
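The address the broker actually dialed is in the `Opening socket connection to server` line, right after the slash. A rough sketch for pulling that field out of a log, assuming the message format shown in this thread; `zk_targets` is a hypothetical helper name:

```shell
# Read Kafka broker log lines on stdin and print the "host/ip:port" target
# from each "Opening socket connection to server ..." line.
zk_targets() {
  sed -n 's/.*Opening socket connection to server \([^ ]*\)\. Will.*/\1/p'
}
```

For a pod in CrashLoopBackOff, something like `kubectl logs kafka-0 --previous | zk_targets` should show which IP the service name resolved to inside the pod.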
When I ping zookeeper from a busybox instance (to get an IP from the name), I get the local svc IP:
kubectl run -i --tty busybox --image=busybox --restart=Never -- sh
If you don't see a command prompt, try pressing enter.
/ # ping zookeeper.default.svc.cluster.local
PING zookeeper.default.svc.cluster.local (10.43.81.81): 56 data bytes
When I get a shell on a zookeeper instance and ping the local svc name, it resolves to my router instead (with a DNS rebind attack warning). I'll try bouncing CoreDNS.
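For anyone hitting the same symptom, here is a rough sketch of how the resolution inside a pod could be inspected and CoreDNS bounced. The pod name `zookeeper-0` comes from the listing in this thread; the presence of `getent` in the image and a Deployment named `coredns` in `kube-system` are assumptions that may not hold on every cluster, and the function names are hypothetical:

```shell
# Show the DNS config a pod actually uses (nameserver, search domains, ndots).
pod_resolv_conf() {
  kubectl exec "$1" -- cat /etc/resolv.conf
}

# Resolve a name from inside a pod. Assumes the image ships getent.
pod_resolve() {
  kubectl exec "$1" -- getent hosts "$2"
}

# Restart CoreDNS. Assumes a Deployment named "coredns" in kube-system.
bounce_coredns() {
  kubectl -n kube-system rollout restart deployment/coredns
}
```

Running `pod_resolve zookeeper-0 zookeeper.default.svc.cluster.local` and then the same name with a trailing dot may be informative: the trailing dot marks the name as fully qualified, so the resolver skips the search list. If the two answers differ, a search domain from `/etc/resolv.conf` is likely being appended and answered by an upstream resolver.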
I'm trying to deploy zookeeper and kafka using this guide:
https://docs.bitnami.com/tutorials/deploy-scalable-kafka-zookeeper-cluster-kubernetes/
I am stuck on stage two where kafka is deployed using:
helm install kafka bitnami/kafka --set zookeeper.enabled=false --set replicaCount=3 --set externalZookeeper.servers=zookeeper.default.svc.cluster.local
I get the following log output, which I understand to mean that zookeeper could not be reached. But when I start a busybox session I can see that the zookeeper svc hostname resolves to an IP. Can someone help me understand what is going wrong? The error seems to point to SASL authentication, but isn't plaintext used by default?
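On the SASL question: the `Will not attempt to authenticate using SASL` line is logged at INFO level, and the same line also appears in the working GKE log in this thread, so it does not look like the failure itself. A small sketch for filtering a broker log down to its WARN and ERROR lines, assuming the `[timestamp] LEVEL message` layout seen in these logs; `problem_lines` is a hypothetical helper name:

```shell
# Read log lines on stdin and keep only those whose level is WARN, ERROR or FATAL.
# Continuation lines such as stack traces have no "[timestamp] LEVEL" prefix
# and are dropped.
problem_lines() {
  grep -E '^\[[^]]*\] (WARN|ERROR|FATAL) '
}
```

With the failing log from this thread, `kubectl logs kafka-0 --previous | problem_lines` would leave the session timeout warning and the two ERROR lines, which point at connectivity rather than authentication.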