Open selkabli opened 5 years ago
Looks like two kafka pods succeed and one fails. It could be https://github.com/Yolean/kubernetes-kafka/commit/463e1c75424c5daf993710c1858df9782c0ed77c though that would be strange because there are 5 zookeeper pods to reach for 3 kafka brokers. Does everything but kafka-2 stay ready, or are there other events? Do the zookeeper services have the expected endpoints?
Please use ``` when you post command output. It makes it a lot more readable. See https://guides.github.com/features/mastering-markdown/
I changed the zookeeper config to maxClientCnxns=2, but the same issue still persists.
```
[root@node1 ~]# kubectl get svc -n kafka
NAME        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
bootstrap   ClusterIP   10.233.9.140    <none>        9092/TCP            13h
broker      ClusterIP   None            <none>        9092/TCP            13h
pzoo        ClusterIP   None            <none>        2888/TCP,3888/TCP   13h
zoo         ClusterIP   None            <none>        2888/TCP,3888/TCP   13h
zookeeper   ClusterIP   10.233.35.111   <none>        2181/TCP            13h
```
```
[root@node1 ~]# kubectl describe svc zookeeper -n kafka
Name:              zookeeper
Namespace:         kafka
Labels:            <none>
Annotations:       kubectl.kubernetes.io/last-applied-configuration:
                     {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"zookeeper","namespace":"kafka"},"spec":{"ports":[{"name":"client"...
Selector:          app=zookeeper
Type:              ClusterIP
IP:                10.233.35.111
Port:              client  2181/TCP
TargetPort:        2181/TCP
Endpoints:         10.233.90.24:2181,10.233.90.26:2181,10.233.92.33:2181 + 2 more...
Session Affinity:  None
Events:            <none>
```
```
[root@node1 ~]# kubectl get pods -n kafka -o wide
NAME      READY   STATUS             RESTARTS   AGE   IP             NODE    NOMINATED NODE   READINESS GATES
kafka-0   1/1     Running            0          13h   10.233.92.35   node3   <none>           <none>
kafka-1   1/1     Running            0          13h   10.233.96.34   node2   <none>           <none>
kafka-2   0/1     CrashLoopBackOff   14         52m   10.233.90.31   node1   <none>           <none>
pzoo-0    1/1     Running            0          13h   10.233.96.30   node2   <none>           <none>
pzoo-1    1/1     Running            1          13h   10.233.92.33   node3   <none>           <none>
pzoo-2    1/1     Running            1          13h   10.233.90.24   node1   <none>           <none>
zoo-0     1/1     Running            0          13h   10.233.96.32   node2   <none>           <none>
zoo-1     1/1     Running            1          13h   10.233.90.26   node1   <none>           <none>
```
I'm puzzled. At this point I can't come up with a single hypothesis to test. Something might come to mind later, but my only advice now is to dig around and do different experiments that involve killing pods.
Edit: zookeeper logs could possibly provide clues.
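A rough sketch of what "digging around" could look like here, in case it helps. The function names and the `KUBECTL` override are made up for illustration; the pod, namespace and service names are the ones from the output above. The `kubectl` subcommands and flags are standard, but which of them reveals anything is a guess.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the kubernetes-kafka repo.
# KUBECTL can be overridden, e.g. to point at a wrapper or a dry-run stub.
KUBECTL="${KUBECTL:-kubectl}"

diagnose_broker() {
  local ns="$1" pod="$2"
  # Logs of the previous (crashed) container instance of the broker.
  "$KUBECTL" -n "$ns" logs "$pod" --previous
  # Events, restart reasons and init container status for the pod.
  "$KUBECTL" -n "$ns" describe pod "$pod"
  # Pod IPs currently backing the zookeeper client service.
  "$KUBECTL" -n "$ns" get endpoints zookeeper -o wide
}

zookeeper_logs() {
  local ns="$1"
  shift
  local pod
  # Print the last 100 log lines of each zookeeper pod given as an argument.
  for pod in "$@"; do
    "$KUBECTL" -n "$ns" logs "$pod" --tail=100
  done
}
```

For this cluster that would be `diagnose_broker kafka kafka-2` followed by `zookeeper_logs kafka pzoo-0 pzoo-1 pzoo-2 zoo-0 zoo-1`.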
@solsson I ran into the same error. When I move Kafka and ZooKeeper to a namespace other than kafka, the Kafka init-config container reports an error:
If the namespace is kafka, the cluster initializes normally and the connection to ZooKeeper works. But that is not what I want, because my project lives in another namespace. So Kafka runs in the kafka namespace and ZooKeeper runs in another namespace. To solve the cross-namespace problem, I created a Service in the kafka namespace:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: kafka-zk-port2
  namespace: kafka
spec:
  ports:
```

Then the error above was reported:
```
[2019-06-26 05:52:11,975] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
[2019-06-26 05:52:12,472] INFO starting (kafka.server.KafkaServer)
[2019-06-26 05:52:12,472] INFO Connecting to zookeeper on zk-cluster-0.zk-cli.zhihuiaj.svc.cluster.local:2181,zk-cluster-1.zk-cli.zhihuiaj.svc.cluster.local:2181,zk-cluster-2.zk-cli.zhihuiaj.svc.cluster.local:2181 (kafka.server.KafkaServer)
[2019-06-26 05:52:12,492] INFO [ZooKeeperClient] Initializing a new session to zk-cluster-0.zk-cli.zhihuiaj.svc.cluster.local:2181,zk-cluster-1.zk-cli.zhihuiaj.svc.cluster.local:2181,zk-cluster-2.zk-cli.zhihuiaj.svc.cluster.local:2181. (kafka.zookeeper.ZooKeeperClient)
[2019-06-26 05:52:12,497] INFO Client environment:zookeeper.version=3.4.13-2d71af4dbe22557fda74f9a9b4309b15a7487f03, built on 06/29/2018 00:39 GMT (org.apache.zookeeper.ZooKeeper)
[2019-06-26 05:52:12,497] INFO Client environment:host.name=kafka-0.kafka-cluster.kafka.svc.cluster.local (org.apache.zookeeper.ZooKeeper)
[2019-06-26 05:52:12,497] INFO Client environment:java.version=11.0.2 (org.apache.zookeeper.ZooKeeper)
[2019-06-26 05:52:12,497] INFO Client environment:java.vendor=Oracle Corporation (org.apache.zookeeper.ZooKeeper)
[2019-06-26 05:52:12,497] INFO Client environment:java.home=/usr/lib/jvm/jdk-11 (org.apache.zookeeper.ZooKeeper)
[2019-06-26 05:52:12,497] INFO Client environment:java.class.path=/opt/kafka/libs/extensions/*:/opt/kafka/bin/../libs/activation-1.1.1.jar:/opt/kafka/bin/../libs/aopalliance-repackaged-2.5.0-b42.jar:/opt/kafka/bin/../libs/argparse4j-0.7.0.jar:/opt/kafka/bin/../libs/audience-annotations-0.5.0.jar:/opt/kafka/bin/../libs/commons-lang3-3.8.1.jar:/opt/kafka/bin/../libs/connect-api-2.2.1.jar:/opt/kafka/bin/../libs/connect-basic-auth-extension-2.2.1.jar:/opt/kafka/bin/../libs/connect-file-2.2.1.jar:/opt/kafka/bin/../libs/connect-json-2.2.1.jar:/opt/kafka/bin/../libs/connect-runtime-2.2.1.jar:/opt/kafka/bin/../libs/connect-transforms-2.2.1.jar:/opt/kafka/bin/../libs/extensions:/opt/kafka/bin/../libs/guava-20.0.jar:/opt/kafka/bin/../libs/hk2-api-2.5.0-b42.jar:/opt/kafka/bin/../libs/hk2-locator-2.5.0-b42.jar:/opt/kafka/bin/../libs/hk2-utils-2.5.0-b42.jar:/opt/kafka/bin/../libs/jackson-annotations-2.9.8.jar:/opt/kafka/bin/../libs/jackson-core-2.9.8.jar:/opt/kafka/bin/../libs/jackson-databind-2.9.8.jar:/opt/kafka/bin/../libs/jackson-datatype-jdk8-2.9.8.jar:/opt/kafka/bin/../libs/jackson-jaxrs-base-2.9.8.jar:/opt/kafka/bin/../libs/jackson-jaxrs-json-provider-2.9.8.jar:/opt/kafka/bin/../libs/jackson-module-jaxb-annotations-2.9.8.jar:/opt/kafka/bin/../libs/javassist-3.22.0-CR2.jar:/opt/kafka/bin/../libs/javax.annotation-api-1.2.jar:/opt/kafka/bin/../libs/javax.inject-1.jar:/opt/kafka/bin/../libs/javax.inject-2.5.0-b42.jar:/opt/kafka/bin/../libs/javax.servlet-api-3.1.0.jar:/opt/kafka/bin/../libs/javax.ws.rs-api-2.1.1.jar:/opt/kafka/bin/../libs/javax.ws.rs-api-2.1.jar:/opt/kafka/bin/../libs/jaxb-api-2.3.0.jar:/opt/kafka/bin/../libs/jersey-client-2.27.jar:/opt/kafka/bin/../libs/jersey-common-2.27.jar:/opt/kafka/bin/../libs/jersey-container-servlet-2.27.jar:/opt/kafka/bin/../libs/jersey-container-servlet-core-2.27.jar:/opt/kafka/bin/../libs/jersey-hk2-2.27.jar:/opt/kafka/bin/../libs/jersey-media-jaxb-2.27.jar:/opt/kafka/bin/../libs/jersey-server-2.27.jar:/opt/kafka/bin/../libs/jetty-client-9.4.14.v20181114.jar:/opt/kafka/bin/../libs/jetty-continuation-9.4.14.v20181114.jar:/opt/kafka/bin/../libs/jetty-http-9.4.14.v20181114.jar:/opt/kafka/bin/../libs/jetty-io-9.4.14.v20181114.jar:/opt/kafka/bin/../libs/jetty-security-9.4.14.v20181114.jar:/opt/kafka/bin/../libs/jetty-server-9.4.14.v20181114.jar:/opt/kafka/bin/../libs/jetty-servlet-9.4.14.v20181114.jar:/opt/kafka/bin/../libs/jetty-servlets-9.4.14.v20181114.jar:/opt/kafka/bin/../libs/jetty-util-9.4.14.v20181114.jar:/opt/kafka/bin/../libs/jopt-simple-5.0.4.jar:/opt/kafka/bin/../libs/kafka-clients-2.2.1.jar:/opt/kafka/bin/../libs/kafka-log4j-appender-2.2.1.jar:/opt/kafka/bin/../libs/kafka-streams-2.2.1.jar:/opt/kafka/bin/../libs/kafka-streams-examples-2.2.1.jar:/opt/kafka/bin/../libs/kafka-streams-scala_2.12-2.2.1.jar:/opt/kafka/bin/../libs/kafka-streams-test-utils-2.2.1.jar:/opt/kafka/bin/../libs/kafka-tools-2.2.1.jar:/opt/kafka/bin/../libs/kafka_2.12-2.2.1-sources.jar:/opt/kafka/bin/../libs/kafka_2.12-2.2.1.jar:/opt/kafka/bin/../libs/log4j-1.2.17.jar:/opt/kafka/bin/../libs/lz4-java-1.5.0.jar:/opt/kafka/bin/../libs/maven-artifact-3.6.0.jar:/opt/kafka/bin/../libs/metrics-core-2.2.0.jar:/opt/kafka/bin/../libs/osgi-resource-locator-1.0.1.jar:/opt/kafka/bin/../libs/plexus-utils-3.1.0.jar:/opt/kafka/bin/../libs/reflections-0.9.11.jar:/opt/kafka/bin/../libs/rocksdbjni-5.15.10.jar:/opt/kafka/bin/../libs/scala-library-2.12.8.jar:/opt/kafka/bin/../libs/scala-logging_2.12-3.9.0.jar:/opt/kafka/bin/../libs/scala-reflect-2.12.8.jar:/opt/kafka/bin/../libs/slf4j-api-1.7.25.jar:/opt/kafka/bin/../libs/slf4j-log4j12-1.7.25.jar:/opt/kafka/bin/../libs/snappy-java-1.1.7.2.jar:/opt/kafka/bin/../libs/validation-api-1.1.0.Final.jar:/opt/kafka/bin/../libs/zkclient-0.11.jar:/opt/kafka/bin/../libs/zookeeper-3.4.13.jar:/opt/kafka/bin/../libs/zstd-jni-1.3.8-1.jar (org.apache.zookeeper.ZooKeeper)
[2019-06-26 05:52:12,497] INFO Client environment:java.library.path=/usr/java/packages/lib:/usr/lib64:/lib64:/lib:/usr/lib (org.apache.zookeeper.ZooKeeper)
[2019-06-26 05:52:12,497] INFO Client environment:java.io.tmpdir=/tmp (org.apache.zookeeper.ZooKeeper)
[2019-06-26 05:52:12,497] INFO Client environment:java.compiler=
```
@amateu It looks like yours is a custom setup with ExternalName for zookeeper. Why don't you edit `zookeeper.connect` in Kafka's config instead? In addition, you seem to have quite specific RBAC in your cluster, and you probably need to customize the RBAC resources.
With @selkabli's issue what is most interesting is that only kafka-2 fails. I think in your setup @amateu all brokers will fail.
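To illustrate the `zookeeper.connect` suggestion: a small sketch that builds the connect string from the per-pod DNS names of a StatefulSet behind a headless service. The function name `zk_connect` is hypothetical, and the `svc.cluster.local` suffix assumes the default cluster domain; the example arguments are taken from the hostnames in the log above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a comma-separated ZooKeeper connect string.
# Arguments: statefulset name, headless service name, namespace, replica count, [client port].
zk_connect() {
  local statefulset="$1" service="$2" namespace="$3" replicas="$4" port="${5:-2181}"
  local hosts="" i
  for ((i = 0; i < replicas; i++)); do
    # Prefix a comma for every entry except the first.
    hosts+="${hosts:+,}${statefulset}-${i}.${service}.${namespace}.svc.cluster.local:${port}"
  done
  printf '%s\n' "$hosts"
}
```

`zk_connect zk-cluster zk-cli zhihuiaj 3` prints the same three addresses the broker log shows, and the result would go into the broker's properties file as `zookeeper.connect=<that string>`, with no extra Service needed in the kafka namespace.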
@solsson Yes, all brokers fail. The cause appeared to be RBAC: I tried creating RBAC resources in my project's namespace so that ZooKeeper and Kafka could be deployed there instead of in the kafka namespace, but the ZooKeeper connection still timed out. So I deployed ZooKeeper and Kafka in a separate, clean test environment without RBAC, and the connection still timed out with the same error as before. Finally, I changed the ZooKeeper YAML, and now both the ZooKeeper and Kafka clusters work normally. I still can't find the specific cause of the earlier problem. As for @selkabli's issue, I think he might have used `hostNetwork: true`.
@solsson The problem happens only on node1, which is the master of my cluster. Any clue why?
The taint has already been removed from the master, so it's not related to the taint.
That's an important observation. I haven't tried running on a master. I have no clue why the zookeeper connection would fail from there.
I'm having the same issue as @selkabli. I am deploying on a bare-metal k8s cluster with local persistent volumes. One broker (out of 3) always fails to start correctly.
Never mind, it seems the PV on one of the nodes had a problem, which caused this. I moved the PV to another node and it works fine.
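For anyone else on local persistent volumes, a sketch of how the volume side could be checked. The function names and the `KUBECTL` override are made up for illustration; the `kubectl` subcommands are standard, and whether they point at the broken volume depends on the failure.

```shell
#!/usr/bin/env bash
# Hypothetical helpers. KUBECTL can be overridden with a wrapper or stub.
KUBECTL="${KUBECTL:-kubectl}"

inspect_volumes() {
  local ns="$1"
  # Claims in the namespace and the volume each one is bound to.
  "$KUBECTL" -n "$ns" get pvc
  # Cluster-wide list of persistent volumes with their status and claim.
  "$KUBECTL" get pv
}

describe_volume() {
  # For local volumes the description includes the node affinity,
  # which tells you on which node the data lives.
  "$KUBECTL" describe pv "$1"
}
```

Running `inspect_volumes kafka` and then `describe_volume <pv-name>` for the volume bound to the failing broker's claim should show which node that broker is pinned to.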
Hi, this is my first time using Kafka, so maybe I'm missing something. Can you please help?