tchiotludo / akhq

Kafka GUI for Apache Kafka to manage topics, topics data, consumers group, schema registry, connect and more...
https://akhq.io/
Apache License 2.0

Topics list fail due to topics with condition "partitions have leader brokers without a matching listener". #75

Open Expo33 opened 5 years ago

Expo33 commented 5 years ago

Hello, we observed this while testing the tool against a dev cluster. It looks like a closed loop. I'm not skilled enough to identify the problem area in the code, so I'm wondering how it looks to you.

Behaviour:

Observations:

Attachments: DescribeTopicsOffsetsExceptionPic-1, DescribeTopicsOffsetsExceptionPic-2, ErrorReadTopicOffsets-SIT.txt, application-example-yml.txt

tchiotludo commented 5 years ago

Thanks for the report. As far as I can see, during a cluster rebalance the consumer API fails after a default timeout of 1 minute. I will set a lower default timeout to avoid wasting so much time.
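The one-minute wait mentioned here matches the documented default of the Kafka consumer setting `default.api.timeout.ms` (60000 ms). A minimal sketch of lowering it, assuming that this is the setting involved; the class name, helper name, broker address and 15-second value are illustrative and not taken from AKHQ:

```java
import java.util.Properties;

class ConsumerTimeoutSketch {
    // Hypothetical helper: copies the base consumer properties and
    // overrides the API timeout with a shorter value.
    static Properties withShortTimeout(Properties base, long timeoutMs) {
        Properties props = new Properties();
        props.putAll(base);
        // Kafka consumer setting; its documented default is 60000 ms.
        props.setProperty("default.api.timeout.ms", Long.toString(timeoutMs));
        return props;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty("bootstrap.servers", "localhost:9092"); // placeholder address
        Properties props = withShortTimeout(base, 15_000L);
        System.out.println(props.getProperty("default.api.timeout.ms")); // prints 15000
    }
}
```

The resulting properties would be handed to the consumer at construction time; the base properties are left unchanged.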

The topic list also failed during the rebalance because of non-mandatory information (it is only used to display the size columns on the topic list), which lost the display of all the other information. I will try to catch the exception and display N/A instead of throwing a 500 error page.
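The "catch the exception and display N/A" idea can be sketched as follows. This is only an illustration under assumptions: `sizeOrNa` and the `Callable` standing in for the size lookup are hypothetical names, not AKHQ code.

```java
import java.util.concurrent.Callable;

class TopicSizeFallback {
    // Runs the (hypothetical) size lookup and falls back to "N/A"
    // instead of letting the exception fail the whole page.
    static String sizeOrNa(Callable<Long> lookup) {
        try {
            return Long.toString(lookup.call());
        } catch (Exception e) {
            // Size is optional display data, so any failure degrades to "N/A".
            return "N/A";
        }
    }

    public static void main(String[] args) {
        System.out.println(sizeOrNa(() -> 1024L)); // prints 1024
        System.out.println(sizeOrNa(() -> {
            throw new RuntimeException("Error for Describe Topics Offsets");
        })); // prints N/A
    }
}
```

With this shape, the rest of the row (name, partition count and so on) can still be rendered when only the size lookup fails.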

I'm not really sure that will be sufficient, but I will try to reproduce it on my side. To be honest, I have never had the "chance" to see a cluster rebalance, so only the nominal case is tested on my side. To help me reproduce the issue, what is the size of your cluster (topic count and size in MB)?

Expo33 commented 5 years ago

From the output of topics --list, there are 267 topics. For cluster size I'm trying the kafka-log-dirs command. I don't have the output organized yet to give total size in MB. Is there another way to get size?
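One way to turn the kafka-log-dirs output into a total is to sum every `size` field in its JSON. The sketch below assumes the JSON has per-partition `"size": <bytes>` entries, as in the sample string; the class name and sample data are made up for illustration. The total counts every replica on every broker, so it reflects disk usage rather than unique data.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogDirsSizeSketch {
    // Matches entries such as "size":1048576 in the JSON text.
    private static final Pattern SIZE = Pattern.compile("\"size\"\\s*:\\s*(\\d+)");

    // Sums all partition sizes (in bytes) found in the JSON string.
    static long totalBytes(String json) {
        long total = 0;
        Matcher m = SIZE.matcher(json);
        while (m.find()) {
            total += Long.parseLong(m.group(1));
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up sample in the assumed output shape.
        String sample = "{\"version\":1,\"brokers\":[{\"broker\":1,\"logDirs\":[{"
            + "\"logDir\":\"/data/kafka\",\"error\":null,\"partitions\":["
            + "{\"partition\":\"long_topic-0\",\"size\":1048576,\"offsetLag\":0,\"isFuture\":false},"
            + "{\"partition\":\"long_topic-1\",\"size\":2097152,\"offsetLag\":0,\"isFuture\":false}]}]}]}";
        long bytes = totalBytes(sample);
        System.out.println(bytes / (1024 * 1024) + " MB"); // prints 3 MB
    }
}
```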

tchiotludo commented 5 years ago

If you have root access on the server, run du -h on the logs dir. KafkaHQ displays the size on the topic list. But don't spend too much time on it, it's not a small cluster 😁

Expo33 commented 5 years ago

The cluster is now 3 servers:

  1. server 1 (original solo): du -h = 46 G
  2. server 2 (recently added): du -h = 102 M
  3. server 3 (recently added): du -h = 124 M

tchiotludo commented 5 years ago

OK, thanks for the information! A medium-size cluster with really uneven replication :smile:

paul-lysak commented 4 years ago

I've got it reproducing reliably on a fresh Kafka cluster with these steps:

  1. Run a 3-node Kafka cluster (in our case these are Docker images based on centos:centos8 with Kafka 2.4.0 installed from https://archive.apache.org/dist/kafka/2.4.0/kafka_2.13-2.4.0.tgz, but that shouldn't matter much)
  2. Run AKHQ: docker run -d --network kafka-training -p 8080:8080 --name akhq -v "$(pwd)/application.yml:/app/application.yml" tchiotludo/akhq:0.14.1
  3. In the AKHQ UI, create a topic with replication factor 1 (by the way, it will most likely be displayed incorrectly once the topic is created: https://github.com/tchiotludo/akhq/issues/326) and at least 3 partitions (the more partitions, the better for the reproduction).
  4. Shut down one of the brokers and refresh the AKHQ page. Warnings appear in the AKHQ logs for about half a minute:

    2020-07-03 12:29:38,161 WARN 1-thread-7 o.a.k.c.NetworkClient [Consumer clientId=consumer-Akhq-2, groupId=Akhq] 2 partitions have leader brokers without a matching listener, including [long_topic-2, lw2_topic-0]

and finally an error message in the browser:

Error for Describe Topics Offsets {}

java.lang.RuntimeException: Error for Describe Topics Offsets {}
at org.akhq.utils.Logger.call(Logger.java:26)
at org.akhq.modules.AbstractKafkaWrapper.describeTopicsOffsets(AbstractKafkaWrapper.java:121)
at org.akhq.modules.$KafkaWrapperRequestScopeDefinition$$exec6.invokeInternal(Unknown Source)

Long story short, having at least one offline partition brings the "Topics" page down, and since it is the default page, the whole UI goes down with it.
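One possible mitigation, sketched here under assumptions and not taken from AKHQ, is to skip partitions that currently have no leader before asking for offsets, so that one offline partition does not fail the whole request. The map of partition name to leader broker id (null meaning no leader) is a hypothetical stand-in for the cluster metadata.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LeaderFilterSketch {
    // leaders: partition name -> leader broker id, or null when the
    // partition has no leader at the moment (hypothetical model).
    static List<String> partitionsWithLeader(Map<String, Integer> leaders) {
        List<String> online = new ArrayList<>();
        for (Map.Entry<String, Integer> e : leaders.entrySet()) {
            if (e.getValue() != null) {
                online.add(e.getKey());
            }
        }
        Collections.sort(online); // stable order for display
        return online;
    }

    public static void main(String[] args) {
        Map<String, Integer> leaders = new HashMap<>();
        leaders.put("long_topic-0", 1);
        leaders.put("long_topic-1", 3);
        leaders.put("long_topic-2", null); // leader broker was shut down
        leaders.put("lw2_topic-0", null);
        System.out.println(partitionsWithLeader(leaders)); // prints [long_topic-0, long_topic-1]
    }
}
```

Offsets would then be requested only for the returned partitions, and the skipped ones could be shown as unavailable.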

tchiotludo commented 4 years ago

Thanks @paul-lysak for the reproduction steps :) I'm aware of this! Most of the hard work to fix this is nearly done (https://github.com/tchiotludo/akhq/pull/298). Using a more powerful front end will allow us to remove some heavy synchronous work on the backend and let the front end build things asynchronously!
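The idea of loading the optional parts asynchronously can be illustrated on the JVM side as well. This is a rough sketch, not the approach of the linked pull request: each topic's size is fetched in its own `CompletableFuture` with a timeout, and a failure or timeout degrades to "N/A" for that topic only. `loadSizes` and the lookup function are hypothetical, and `orTimeout` requires Java 9 or later.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

class AsyncSizesSketch {
    // Starts one asynchronous size lookup per topic. A failed or slow
    // lookup yields "N/A" for that topic without affecting the others.
    static Map<String, CompletableFuture<String>> loadSizes(
            List<String> topics, Function<String, Long> lookup, long timeoutMs) {
        Map<String, CompletableFuture<String>> result = new LinkedHashMap<>();
        for (String topic : topics) {
            result.put(topic, CompletableFuture
                .supplyAsync(() -> Long.toString(lookup.apply(topic)))
                // Completes the future exceptionally after the timeout;
                // the underlying task itself is not interrupted.
                .orTimeout(timeoutMs, TimeUnit.MILLISECONDS)
                .exceptionally(e -> "N/A"));
        }
        return result;
    }

    public static void main(String[] args) {
        Function<String, Long> lookup = topic -> {
            if (topic.equals("lw2_topic")) {
                throw new IllegalStateException("partition offline");
            }
            return 42L;
        };
        Map<String, CompletableFuture<String>> sizes =
            loadSizes(List.of("long_topic", "lw2_topic"), lookup, 1000L);
        sizes.forEach((topic, size) -> System.out.println(topic + ": " + size.join()));
    }
}
```

The topic names can be rendered immediately, with each size filled in as its future completes.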