abraham-leal / kafka-idle-topics

A tool to detect idle topics in your cluster
Apache License 2.0

broken pipe errors when executing the utility #6

Closed apsiyer closed 8 months ago

apsiyer commented 10 months ago

Hi, I was trying to run this utility against Confluent Cloud Kafka to compile a list of idle topics, but whatever settings I use, it fails with a broken pipe error. The server has more than 1500 topics and 600 consumer groups.

./kafka-idle-topics -bootstrap-servers pkc-**.northeurope.azure.confluent.cloud:9092 -username **REMOVED -password REMOVED -kafkaSecurity plain_tls -idleMinutes 43200 -productionAssessmentTimeMs 43200
2023/10/19 17:12:46 Loading Topics...
2023/10/19 17:12:49 Evaluating Topics that haven't been produced to since... 2023-09-19 17:12:49.563518 +0100 BST m=-2591995.340513124
2023/10/19 17:31:53 Evaluating Topics without active Consumer Groups...
2023/10/19 17:31:53 Could not obtain Consumer Groups from cluster: write tcp 192.168.0.48:56237->20.105.49.156:9092: write: broken pipe

Other options that I tried.

./kafka-idle-topics -bootstrap-servers pkc-**.northeurope.azure.confluent.cloud:9092 -username **REMOVED -password REMOVED -kafkaSecurity plain_tls -idleMinutes 262800

Could you please help me identify what the issue is here? Thank you, A

aadubey commented 10 months ago

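Seems like a Sarama admin client problem with broker idle-timeout disconnections: https://github.com/IBM/sarama/issues/1796

The topic evaluation loop took too long (about 19 minutes in the first log), so the connection had likely been idle long enough to be dropped, which made the consumer group call fail fast.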

abraham-leal commented 10 months ago

Interesting find, thanks for reporting this issue. It sounds like Sarama's OffsetNewest might be a bit inefficient; I'd need to look into a dump to identify whether it's that or the actual consume operation for a large number of partitions. In the meantime, splitting your evaluation into multiple sets will probably allow you to do the whole thing, just over a somewhat longer timeframe. I'll take a look into this when I have time, or if you'd like to contribute, I'll be more than happy to review!

SandeepSehra commented 10 months ago

We also get the same error when running it through Docker. If you can help us debug it, that would be much appreciated.

~ docker run \
  -e KAFKA_BOOTSTRAP="lkc--.eu-west-1.aws.glb.confluent.cloud:9092" \
  -e KAFKA_USERNAME="" \
  -e KAFKA_PASSWORD="" \
  abrahamleal/kafka-idle-topics:latest-arm64
2023/11/07 16:21:09 lkc-**-***.eu-west-1.aws.glb.confluent.cloud:9092
2023/11/07 16:21:09 API KEY
2023/11/07 16:21:09 API KEY SECRET
2023/11/07 16:21:09 plain_tls
2023/11/07 16:21:13 Loading Topics...
2023/11/07 16:21:18 Evaluating Topics without any active production...
2023/11/07 16:31:19 Waiting for 30000 ms to evaluate active production.
2023/11/07 16:41:42 Evaluating Topics without active Consumer Groups...
2023/11/07 16:41:43 Could not obtain Consumer Groups from cluster: EOF

abraham-leal commented 10 months ago

@SandeepSehra For now, the solution would be to evaluate a subset of topics at a time. I think I have a solution in mind but haven't had time to implement it. For example, if you are trying to evaluate 100 topics maybe you should do two sets of 50, or 3 sets of 33/34. Does that make sense?

SandeepSehra commented 10 months ago

@abraham-leal Yes, it does. We're trying to evaluate our non-prod cluster, which has 1098 topics, and then similarly our prod cluster, which has 555 topics.

Could you suggest how to analyse this? I've tried different things but still had no success.

abraham-leal commented 10 months ago

I'd suggest evaluating 100 topics or so at a time.
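
The batching suggested above can be sketched roughly as follows. This is only an illustration, assuming the topic names have already been listed: `chunkTopics` and the generated topic names are hypothetical and not part of kafka-idle-topics, and how each batch is then handed to the tool is not specified at this point in the thread.

```go
package main

import "fmt"

// chunkTopics is a hypothetical helper: it splits topics into consecutive
// batches of at most size entries. It returns nil when size is not positive.
func chunkTopics(topics []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var batches [][]string
	for start := 0; start < len(topics); start += size {
		end := start + size
		if end > len(topics) {
			end = len(topics)
		}
		batches = append(batches, topics[start:end])
	}
	return batches
}

func main() {
	// Made-up topic names standing in for a 1098-topic cluster.
	topics := make([]string, 1098)
	for i := range topics {
		topics[i] = fmt.Sprintf("topic-%04d", i)
	}

	batches := chunkTopics(topics, 100)
	// Number of batches, then the size of the last (partial) batch.
	fmt.Println(len(batches), len(batches[len(batches)-1])) // prints: 11 98
}
```

With 1098 topics and a batch size of 100, that gives ten full batches plus one of 98, so each run of the tool stays short.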

apsiyer commented 10 months ago

Apologies for the silly question. Could you please give me an example command line showing how to pass in a partial set of topics?

abraham-leal commented 8 months ago

@apsiyer In the current version, you may limit the number of topics through the permissions of the credentials given to the tooling. The current master branch has a capability to provide allow/disallow lists.

Additionally, I've merged to master a fix for this issue by creating new clients for every cluster interaction. If you have a chance to test the master build, I'd appreciate it. I'll attach the binary. kafka-idle-topics.zip

@SandeepSehra I would also appreciate if you could give it a spin :)