danielqsj / kafka_exporter

Kafka exporter for Prometheus
Apache License 2.0
2.1k stars, 602 forks

Cannot get oldest offset of topic kafkatopic2 partition 1: #311

Open aditya-iaxis opened 2 years ago

aditya-iaxis commented 2 years ago

Hello all / @danielqsj, I am facing the following issue.

Cannot get oldest offset of topic kafkatopic2 partition 1: kafka server: In the middle of a leadership election, there is currently no leader for this partition and hence it is unavailable for writes.

Prometheus also reports: Get "http://172.40.1.32:9308/metrics": dial tcp 172.40.1.32:9308: connect: connection refused, and sometimes "context deadline exceeded".

The kafka_exporter target in Prometheus keeps toggling between down and up.

However, the Kafka pod logs show no errors. Several other apps use the same Kafka cluster and have no issues.

Also, the metrics link above takes about 30-40 seconds to open. Any pointers for this?

shengbinxu commented 8 months ago

I encountered a similar problem. My log shows: "Cannot get oldest offset of topic kafkatopic2 partition 1: kafka server: In the middle of a leadership election, there is currently no leader for this partition and hence it is unavailable for writes." The metrics link also takes 30-40 seconds to open.

This happens because some topic partitions in the Kafka cluster have no leader. The exporter then takes a long time to retrieve offsets for the affected topic, which makes the whole metrics endpoint slow to respond.
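To confirm which partitions have no leader, one option is to run `kafka-topics.sh --describe` against the cluster and scan the output. Below is a minimal sketch of that scan. The sample text is hypothetical (it is not output from this issue), the helper name `leaderless_partitions` is my own, and I am assuming the leader column shows `none` or `-1` when no leader is elected, which depends on the Kafka version.

```python
import re

# Hypothetical sample of `kafka-topics.sh --describe` output (not from this issue).
SAMPLE = """\
Topic: kafkatopic2  PartitionCount: 2  ReplicationFactor: 2
Topic: kafkatopic2  Partition: 0  Leader: 2  Replicas: 2,1  Isr: 2
Topic: kafkatopic2  Partition: 1  Leader: none  Replicas: 1,2  Isr: 2
"""

# Matches per-partition lines only; the summary line has no "Partition:" field.
PARTITION_LINE = re.compile(r"Topic:\s*(\S+)\s+Partition:\s*(\d+)\s+Leader:\s*(\S+)")


def leaderless_partitions(describe_output):
    """Return (topic, partition) pairs whose leader is reported as none or -1."""
    found = []
    for line in describe_output.splitlines():
        match = PARTITION_LINE.search(line)
        # Assumption: a missing leader is printed as "none" or "-1".
        if match and match.group(3) in ("none", "-1"):
            found.append((match.group(1), int(match.group(2))))
    return found


if __name__ == "__main__":
    for topic, partition in leaderless_partitions(SAMPLE):
        print(f"{topic} partition {partition} has no leader")
```

If this reports any partitions, the slow scrape is likely a symptom of the broker problem rather than an exporter bug.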

In this situation:

1. It is advisable to increase scrape_timeout in Prometheus so that the slow scrape can still complete.
2. In our case, the exporter logged the error above because one broker had failed, so some topic partitions really were unable to accept writes. The underlying fix is to restore that broker.
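For the scrape_timeout change, a Prometheus config fragment might look roughly like this. The job name and the two durations are assumptions chosen to sit above the 30-40 second response time reported here; the target is the exporter address from the original report. Note that Prometheus requires scrape_timeout to be no larger than scrape_interval.

```yaml
scrape_configs:
  - job_name: kafka_exporter          # hypothetical job name
    scrape_interval: 60s              # assumed value; must be >= scrape_timeout
    scrape_timeout: 50s               # assumed value; above the 30-40 s response time
    static_configs:
      - targets: ["172.40.1.32:9308"] # exporter address from the original report
```

This only stops the target from flapping; the scrape stays slow until the leaderless partitions are fixed.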