didi / KnowStreaming

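A one-stop cloud-native real-time streaming data platform that builds enterprise-grade Kafka services in a zero-intrusion, plugin-based way, greatly lowering the barrier to operating, storing, and managing real-time streaming data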
https://knowstreaming.com
GNU Affero General Public License v3.0

Standalone Kafka is connected, but the ZK connection fails and related information cannot be refreshed #1132

Open akinlong opened 10 months ago

akinlong commented 10 months ago

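Environment information

Steps to reproduce

  1. Start the cluster with docker-compose.

  2. Configure a standalone Kafka on the page, with the configuration shown here: [image]

  3. The cluster status is shown as abnormal, and the backend logs errors: [image] [image]

Expected result

ZK connects normally and the Kafka information is displayed.

Actual result

Cannot connect to ZK.


If there is an exception, please attach the exception trace: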

2023-08-24 12:30:03.637 ERROR 11 --- [-2-10-thread-82] k.s.k.c.s.h.c.AbstractHealthCheckService : method=checkAndGetResult||clusterParam=ZookeeperParam(zkAddressList=[Tuple{v1=192.168.162.12, v2=2181}], zkConfig=null)||clusterHealthConfig=HealthAmountRatioConfig(amount=100000, ratio=0.8)||errMsg=exception!

java.lang.NullPointerException: null
        at com.xiaojukeji.know.streaming.km.core.service.health.checker.zookeeper.HealthCheckZookeeperService.checkWatchCount(HealthCheckZookeeperService.java:188)
        at com.xiaojukeji.know.streaming.km.core.service.health.checker.AbstractHealthCheckService.checkAndGetResult(AbstractHealthCheckService.java:50)
        at com.xiaojukeji.know.streaming.km.task.kafka.health.AbstractHealthCheckTask.checkAndGetResult(AbstractHealthCheckTask.java:109)
        at com.xiaojukeji.know.streaming.km.task.kafka.health.AbstractHealthCheckTask.lambda$calAndUpdateHealthCheckResult$0(AbstractHealthCheckTask.java:60)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

2023-08-24 12:30:03.637  INFO 11 --- [-2-10-thread-82] .x.k.s.k.c.s.v.BaseVersionControlService : method=doVCHandler||clusterId=2||action=ApproximateDataSize||type=ZookeeperMetric||param={"clusterPhyId":2,"metricName":"ApproximateDataSize","zkAddressList":[{"v1":"192.168.162.12","v2":2181}]}
2023-08-24 12:30:03.638 ERROR 11 --- [-2-10-thread-82] k.s.k.c.s.h.c.AbstractHealthCheckService : method=checkAndGetResult||clusterParam=ZookeeperParam(zkAddressList=[Tuple{v1=192.168.162.12, v2=2181}], zkConfig=null)||clusterHealthConfig=HealthAmountRatioConfig(amount=524288000, ratio=0.8)||errMsg=exception!

java.lang.NullPointerException: null
        at com.xiaojukeji.know.streaming.km.core.service.health.checker.zookeeper.HealthCheckZookeeperService.checkApproximateDataSize(HealthCheckZookeeperService.java:260)
        at com.xiaojukeji.know.streaming.km.core.service.health.checker.AbstractHealthCheckService.checkAndGetResult(AbstractHealthCheckService.java:50)
        at com.xiaojukeji.know.streaming.km.task.kafka.health.AbstractHealthCheckTask.checkAndGetResult(AbstractHealthCheckTask.java:109)
        at com.xiaojukeji.know.streaming.km.task.kafka.health.AbstractHealthCheckTask.lambda$calAndUpdateHealthCheckResult$0(AbstractHealthCheckTask.java:60)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

2023-08-24 12:30:03.639  INFO 11 --- [-2-10-thread-82] .x.k.s.k.c.s.v.BaseVersionControlService : method=doVCHandler||clusterId=2||action=NumAliveConnections||type=ZookeeperMetric||param={"clusterPhyId":2,"metricName":"NumAliveConnections","zkAddressList":[{"v1":"192.168.162.12","v2":2181}]}
2023-08-24 12:30:09.672  INFO 11 --- [kTP-6-thread-14] kafka.zookeeper.ZooKeeperClient          : [ZooKeeperClient KS-ZK-ClusterPhyId-1] Closed.
2023-08-24 12:30:09.672 ERROR 11 --- [kTP-6-thread-14] c.x.k.s.k.p.kafka.KafkaAdminZKClient     : method=createZKClient||clusterPhyId=1||clusterPhy=ClusterPhy(id=1, createTime=Thu Aug 24 10:18:49 GMT+08:00 2023, updateTime=Thu Aug 24 11:28:26 GMT+08:00 2023, name=192.168.162.12, bootstrapServers=192.168.162.12:9092, kafkaVersion=2.8.1, zookeeper=192.168.162.12:2181, clientProperties={}, jmxProperties={"jmxPort":9999,"maxConn":10,"openSSL":false}, zkProperties=, authType=0, runState=1, description=)||msg=create ZK Client failed||errMsg=exception

java.lang.InterruptedException: null
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2067)
        at kafka.zookeeper.ZooKeeperClient.waitUntilConnected(ZooKeeperClient.scala:273)
        at kafka.zookeeper.ZooKeeperClient.<init>(ZooKeeperClient.scala:125)
        at kafka.zk.KafkaZkClient$.apply(KafkaZkClient.scala:1948)
        at kafka.zk.KafkaZkClient.apply(KafkaZkClient.scala)
        at com.xiaojukeji.know.streaming.km.persistence.kafka.KafkaAdminZKClient.createZKClient(KafkaAdminZKClient.java:137)
        at com.xiaojukeji.know.streaming.km.persistence.kafka.KafkaAdminZKClient.createZKClient(KafkaAdminZKClient.java:121)
        at com.xiaojukeji.know.streaming.km.persistence.kafka.KafkaAdminZKClient.getClient(KafkaAdminZKClient.java:45)
        at com.xiaojukeji.know.streaming.km.persistence.kafka.zookeeper.service.impl.KafkaZKDAOImpl.getKafkaController(KafkaZKDAOImpl.java:153)
        at com.xiaojukeji.know.streaming.km.persistence.kafka.zookeeper.service.impl.KafkaZKDAOImpl$$FastClassBySpringCGLIB$$6e05ff5c.invoke(<generated>)
        at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
        at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:771)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
        at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:749)
        at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:139)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
        at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:749)
        at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:691)
        at com.xiaojukeji.know.streaming.km.persistence.kafka.zookeeper.service.impl.KafkaZKDAOImpl$$EnhancerBySpringCGLIB$$64aa1c82.getKafkaController(<generated>)
        at com.xiaojukeji.know.streaming.km.core.service.kafkacontroller.impl.KafkaControllerServiceImpl.getControllerFromZKClient(KafkaControllerServiceImpl.java:186)
        at com.xiaojukeji.know.streaming.km.core.service.kafkacontroller.impl.KafkaControllerServiceImpl.getControllerFromKafka(KafkaControllerServiceImpl.java:45)
        at com.xiaojukeji.know.streaming.km.task.kafka.metadata.SyncControllerTask.processClusterTask(SyncControllerTask.java:36)
        at com.xiaojukeji.know.streaming.km.task.kafka.metadata.AbstractAsyncMetadataDispatchTask.lambda$asyncProcessSubTask$0(AbstractAsyncMetadataDispatchTask.java:33)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

2023-08-24 12:30:09.672 ERROR 11 --- [kTP-6-thread-14] k.s.k.c.s.k.i.KafkaControllerServiceImpl : method=getControllerFromZKClient||clusterPhyId=1||errMsg=exception

com.xiaojukeji.know.streaming.km.common.exception.NotExistException: kafka kafka-zk-client not exist due to create failed
        at com.xiaojukeji.know.streaming.km.persistence.kafka.KafkaAdminZKClient.getClient(KafkaAdminZKClient.java:47)
        at com.xiaojukeji.know.streaming.km.persistence.kafka.zookeeper.service.impl.KafkaZKDAOImpl.getKafkaController(KafkaZKDAOImpl.java:153)
        at com.xiaojukeji.know.streaming.km.persistence.kafka.zookeeper.service.impl.KafkaZKDAOImpl$$FastClassBySpringCGLIB$$6e05ff5c.invoke(<generated>)
        at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
        at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:771)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
        at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:749)
        at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:139)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
        at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:749)
        at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:691)
        at com.xiaojukeji.know.streaming.km.persistence.kafka.zookeeper.service.impl.KafkaZKDAOImpl$$EnhancerBySpringCGLIB$$64aa1c82.getKafkaController(<generated>)
        at com.xiaojukeji.know.streaming.km.core.service.kafkacontroller.impl.KafkaControllerServiceImpl.getControllerFromZKClient(KafkaControllerServiceImpl.java:186)
        at com.xiaojukeji.know.streaming.km.core.service.kafkacontroller.impl.KafkaControllerServiceImpl.getControllerFromKafka(KafkaControllerServiceImpl.java:45)
        at com.xiaojukeji.know.streaming.km.task.kafka.metadata.SyncControllerTask.processClusterTask(SyncControllerTask.java:36)
        at com.xiaojukeji.know.streaming.km.task.kafka.metadata.AbstractAsyncMetadataDispatchTask.lambda$asyncProcessSubTask$0(AbstractAsyncMetadataDispatchTask.java:33)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

2023-08-24 12:30:09.672 ERROR 11 --- [kTP-6-thread-14] .t.k.m.AbstractAsyncMetadataDispatchTask : method=asyncProcessSubTask||taskName=SyncControllerTask||clusterPhyId=1||taskResult=TaskResult(code=-1, message=kafka kafka-zk-client not exist due to create failed)||msg=failed

akinlong commented 10 months ago

[image] Connecting with Offset Explorer works fine. The Kafka cluster version is 2.8.1.

ZQKC commented 10 months ago

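> [image] Connecting with Offset Explorer works fine. The Kafka cluster version is 2.8.1.

On the machine where KS runs, try connecting to ZK with a zk client and see whether it succeeds.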
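If no zk client is at hand on the KS host or inside the container, a plain TCP probe can at least confirm that the ZK port is reachable and show how long the connection takes. The following is a minimal sketch, not part of KnowStreaming; the function name `tcp_connect_time` is made up for illustration, and it only tests the TCP layer, not a full ZooKeeper session.

```python
import socket
import time


def tcp_connect_time(host: str, port: int, timeout: float = 5.0) -> float:
    """Open and immediately close a TCP connection; return elapsed seconds.

    Raises OSError (including timeouts) if the port cannot be reached.
    """
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; nothing is sent
    return time.monotonic() - start


# Example call, using the ZK address from the log above (adjust as needed):
#   elapsed = tcp_connect_time("192.168.162.12", 2181)
#   print(f"connected in {elapsed * 1000:.1f} ms")
```

A successful probe only rules out network and firewall problems. The ZK session handshake that KS performs can still fail or time out afterwards.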

akinlong commented 10 months ago

[image] The server hosting the container can connect to the remote ZK normally.

akinlong commented 10 months ago

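> [image] Connecting with Offset Explorer works fine. The Kafka cluster version is 2.8.1.

> On the machine where KS runs, try connecting to ZK with a zk client and see whether it succeeds.

[image] Also, after copying the zk client into the ks manager container, the connection from inside the container succeeds as well.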

akinlong commented 10 months ago

[image] Judging from the ZK server log, it was the client that actively closed the connection. The ZK four-letter commands are also enabled.

akinlong commented 10 months ago

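Something magical happened: after I manually lengthened the two timeout settings in the ZK configuration stored in the database table, the connection works. Judging from the packet capture, though, the connection time does not seem to have exceeded the configured value.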

niejian commented 9 months ago

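> Something magical happened: after I manually lengthened the two timeout settings in the ZK configuration stored in the database table, the connection works. Judging from the packet capture, though, the connection time does not seem to have exceeded the configured value.

Which table was that?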

ZQKC commented 8 months ago

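> Something magical happened: after I manually lengthened the two timeout settings in the ZK configuration stored in the database table, the connection works. Judging from the packet capture, though, the connection time does not seem to have exceeded the configured value.

> Which table was that?

My understanding is that it is the ks_km_physical_cluster table.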

xuexh0816 commented 7 months ago

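> Something magical happened: after I manually lengthened the two timeout settings in the ZK configuration stored in the database table, the connection works. Judging from the packet capture, though, the connection time does not seem to have exceeded the configured value.

Where do I add this configuration, and which key is it?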

jiangminbing commented 7 months ago

[image: 1701176461308] I have run into the same problem. How did you solve it?

ZQKC commented 7 months ago

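> Something magical happened: after I manually lengthened the two timeout settings in the ZK configuration stored in the database table, the connection works. Judging from the packet capture, though, the connection time does not seem to have exceeded the configured value.

> Where do I add this configuration, and which key is it?

See the documentation under doc; it describes the ZK configuration.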

ZQKC commented 7 months ago

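> [image: 1701176461308] I have run into the same problem. How did you solve it?

This is probably caused by a failure to fetch the ZK metrics. Try running a four-letter command and see what happens.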
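For anyone without `nc` or `telnet` in the container, the four-letter check can be sketched in a few lines of Python. This is an illustrative sketch rather than anything KnowStreaming ships: the helper names are made up, and the idea that the `checkWatchCount` and `checkApproximateDataSize` NullPointerExceptions come from missing `mntr` values is an assumption based on the metric names in the log. The `mntr` keys used here (`zk_watch_count`, `zk_approximate_data_size`, `zk_num_alive_connections`) are standard ZooKeeper output.

```python
import socket

# mntr keys that appear to correspond to the failing health checks above
# (this mapping is an assumption based on the metric names in the log).
WANTED = ("zk_watch_count", "zk_approximate_data_size", "zk_num_alive_connections")


def four_letter_word(host: str, port: int, cmd: str, timeout: float = 5.0) -> str:
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(reply: str) -> dict:
    """Parse tab-separated `mntr` output into a {key: value} dict.

    Lines without a tab (for example a 'not in the whitelist' notice)
    are ignored, so a rejected command yields an empty dict.
    """
    metrics = {}
    for line in reply.splitlines():
        key, sep, value = line.partition("\t")
        if sep:
            metrics[key.strip()] = value.strip()
    return metrics


def missing_metrics(metrics: dict, wanted=WANTED) -> list:
    """Return the wanted keys that are absent from the parsed reply."""
    return [key for key in wanted if key not in metrics]


# Example call, using the ZK address from the log above (adjust as needed):
#   reply = four_letter_word("192.168.162.12", 2181, "mntr")
#   print(missing_metrics(parse_mntr(reply)) or "all metrics present")
```

If the reply is a single notice that the command is not in the whitelist instead of key/value lines, every wanted metric will be reported as missing, which would point at the ZooKeeper four-letter-word whitelist rather than at the network.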

jiangminbing commented 7 months ago

Add a setting to the ZooKeeper config that lifts the restriction on four-letter commands: edit zookeeper.properties and set 4lw.commands.whitelist=*