ElfoLiNk closed this issue 3 weeks ago
@ElfoLiNk You can change the scrape mode to the Kafka API (adminApi). If I'm not wrong, this should only happen with the offsets topic scrape mode. KMinion collects the group offsets, but it's a different API call to retrieve the log-end-offset for each partition in these topics. Both the end offset and the group offset are needed to calculate the lag. The warning itself shouldn't cause KMinion to stop working in general, but it won't be able to report group lags for this specific topic.
Let me know if that helps
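To make the lag calculation concrete, here is a minimal sketch in Python. It is not KMinion's actual code. The function names and the dict-based inputs are assumptions made for illustration only. It shows why a partition with a committed offset but no known end offset cannot get a lag value.

```python
from typing import Dict


def partition_lag(log_end_offset: int, committed_offset: int) -> int:
    # Lag is the distance between the newest offset in the partition and
    # the offset the group has committed. Clamp at zero because the two
    # values are fetched by separate calls and may be slightly out of sync.
    return max(log_end_offset - committed_offset, 0)


def group_topic_lag(end_offsets: Dict[int, int],
                    committed: Dict[int, int]) -> Dict[int, int]:
    # Both inputs map partition id -> offset (hypothetical shapes).
    lags = {}
    for partition, offset in committed.items():
        if partition not in end_offsets:
            # No watermark for this partition: lag cannot be computed.
            continue
        lags[partition] = partition_lag(end_offsets[partition], offset)
    return lags
```

Here partition 2 would be skipped if the group had a committed offset for it but no end offset was available, which is the situation the warning describes.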
Hi @weeco, thank you for the quick response. I'm already using the adminApi:
minion:
  consumerGroups:
    enabled: true
    scrapeMode: adminApi
    granularity: partition
Ok that's pretty odd. Besides the warning, could you explain what is not working? Are you missing metrics in the output? Are they consistently missing or just occasionally?
I think the issue is that EventHub lowercases topic names. In the metrics I have, for example, both ENV_topicName and env_topicname:
kminion_kafka_consumer_group_topic_lag -> topic_name label = ENV_topicName
kminion_kafka_topic_partition_high_water_mark -> topic_name label = env_topicname
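The mismatch above can be sketched as a failed dictionary lookup. This is a hypothetical illustration, not KMinion's implementation. The helper names and data shapes are assumptions. It shows that an exact-match lookup of the group's topic name misses the watermarks, while a case-insensitive lookup finds them.

```python
def index_watermarks(watermarks):
    # watermarks: topic name -> {partition id -> high water mark}
    # Re-key the mapping by lowercased topic name.
    return {topic.lower(): marks for topic, marks in watermarks.items()}


def find_watermarks(index, group_topic):
    # Look up the topic reported by the consumer group, ignoring case.
    return index.get(group_topic.lower())


watermarks = {"env_topicname": {0: 120}}

# Exact match fails: the group reports the original casing.
exact = watermarks.get("ENV_topicName")

# Case-insensitive match succeeds.
relaxed = find_watermarks(index_watermarks(watermarks), "ENV_topicName")
```

Note that Kafka topic names are case-sensitive, so normalizing like this would be wrong on a cluster that has two topics differing only in case.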
I also tried to use KMinion with Eventhubs. In general everything works, but I think the main problem is how DescribeConsumerGroups behaves for the Kafka API of the Eventhub.
The DescribeConsumerGroups call only reports offsets as long as there is at least one consumer in the group. When the last consumer stops consuming, the call will stop reporting offsets some seconds afterwards.
So, when e.g. a consumer dies and thus no longer consumes, the lag will rise but we are unable to see it because the group is no longer reported by DescribeConsumerGroups.
I think the only way to fix this would be to cache DescribeConsumerGroups responses and use the last valid one when an empty response is returned. But I don't know if this is something that you want to see in KMinion, @weeco.
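A rough sketch of that caching idea in Python, under the assumption that offsets arrive as a plain dict per group. The class and method names are hypothetical and nothing like this exists in KMinion as far as this thread shows.

```python
class LastKnownOffsets:
    """Remembers the last non-empty offsets seen for each consumer group."""

    def __init__(self):
        self._last = {}

    def update(self, group, offsets):
        # offsets: (topic, partition) -> committed offset; may be empty
        # when the broker stops reporting the group.
        if offsets:
            self._last[group] = dict(offsets)
        # Fall back to the last valid response, or an empty dict if the
        # group has never reported any offsets.
        return self._last.get(group, {})


cache = LastKnownOffsets()
first = cache.update("my-group", {("env_topicname", 0): 42})
second = cache.update("my-group", {})  # consumer stopped, empty response
```

One drawback of this approach is that a group which was really deleted would keep being reported with stale offsets unless entries also expire after some time.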
I think this is a bug / protocol violation in Eventhub. We cannot make changes for invalid implementations of the Kafka protocol given there are so many different implementations. I recommend raising this with the Eventhub team; this should have an impact on a lot of administrative tools.
I also think it is a protocol violation in Eventhub. I could not find any documentation concerning this, but with this behavior no Kafka metrics exporter will work with Eventhubs.
It would be good to support Azure EventHub since it has a Kafka API.
Right now in the kminion logs I see the following warning:
"consumer group has committed offsets on a topic we don't have watermarks for"
The kafka-consumer-groups CLI is able to display lag information from Azure EventHub: