amuraru closed 3 years ago
I think that's not desired for a lot of users as they use consumer group lags to monitor for potential issues (e.g. because the consuming application is no longer alive).
That can be a valid use case, I agree. Would it make sense to make this optional, with a configurable list of group states for which the offsets are reported? Default: all.
Again, in our environment there are lots of short-lived consumer groups, and lots of low-value metrics are reported for them.
Hmm I'm unsure.
I'd like to figure out exactly which requests take too long with the consumer groups. Maybe all issues are solved if you use offsetsTopic as the scrape mode? The largest clusters I tested against had ~400 consumer groups and it was okayish (3-5s request duration) for describing the consumer groups.
Agreed. The scrape time is one dimension; the other is the number of metrics scraped by Prometheus.
The overall time decreased when applying the https://github.com/cloudhut/kminion/pull/101 patch, so that would be something you could check, please.
let's give this PR more thought - I agree
Closing - will address this more generically in https://github.com/cloudhut/kminion/issues/108
Reduce the number of metrics for clusters where lots of consumer groups are short-lived and empty. The proposed change is to report consumer group offsets only for groups in the Stable state.
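
For illustration only, here is a minimal Go sketch of the state filter discussed in this thread: an optional allow-list of group states, where an empty list means "report all groups". The `GroupInfo` type and `filterByState` function are hypothetical names made up for this example; they are not taken from the kminion codebase or from this PR's diff.

```go
package main

import (
	"fmt"
	"strings"
)

// GroupInfo is a hypothetical, simplified view of a described consumer group.
// It is not a kminion type.
type GroupInfo struct {
	Name  string
	State string // Kafka group state, e.g. "Stable", "Empty", "Dead"
}

// filterByState returns the groups whose state is in the allowed list.
// An empty allowed list keeps every group, matching the "default all"
// behaviour suggested in the discussion.
func filterByState(groups []GroupInfo, allowed []string) []GroupInfo {
	if len(allowed) == 0 {
		return groups
	}
	// Build a case-insensitive lookup set of the allowed states.
	set := make(map[string]struct{}, len(allowed))
	for _, s := range allowed {
		set[strings.ToLower(s)] = struct{}{}
	}
	var out []GroupInfo
	for _, g := range groups {
		if _, ok := set[strings.ToLower(g.State)]; ok {
			out = append(out, g)
		}
	}
	return out
}

func main() {
	groups := []GroupInfo{
		{Name: "orders-service", State: "Stable"},
		{Name: "console-consumer-123", State: "Empty"},
		{Name: "old-batch-job", State: "Dead"},
	}
	// Only groups in the Stable state would have offsets reported.
	for _, g := range filterByState(groups, []string{"Stable"}) {
		fmt.Println(g.Name)
	}
}
```

Keeping "all states" as the default would preserve lag monitoring for groups whose consumers have died, which is the concern raised earlier in the thread, while letting clusters with many short-lived groups opt in to a narrower list.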