Open jason-da-redpanda opened 1 year ago
Can you please list out the metrics you would like to see?
Hi @jcsp, I updated the description with a bit more detail on the kinds of things we are looking for, albeit at a high level. Do you want the actual suggested metric names listed for each? Something like join-time-avg, join-time-max, join-total (just examples).
Who is this for and what problem do they have today?
We do not really have good metrics for Consumer Group basics, e.g. lag, rebalancing/rejoin, heartbeats, latency.
The kind of thing we are typically interested in knowing is
This is for Redpanda admins/Support trying to troubleshoot Consumer Group issues
What are the success criteria?
Metrics exposed for things like join-rate*, heartbeats, lag, and latencies.
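To make the lag item concrete, here is a minimal sketch of how per-partition and per-group lag are commonly derived: log end offset minus the group's committed offset. The names used here (`PartitionOffsets`, `partition_lag`, `group_lag`, `lag_sum`, `lag_max`) are hypothetical and for illustration only; they are not existing Redpanda metrics or APIs.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class PartitionOffsets:
    """Offsets for one partition as seen by one consumer group (hypothetical type)."""
    topic: str
    partition: int
    log_end_offset: int    # next offset to be written to the partition
    committed_offset: int  # last offset committed by the group


def partition_lag(p: PartitionOffsets) -> int:
    # Lag is how far the committed offset trails the log end.
    # Clamp at 0 in case the two values were sampled at slightly different times.
    return max(p.log_end_offset - p.committed_offset, 0)


def group_lag(offsets: Iterable[PartitionOffsets]) -> Dict[str, int]:
    # Aggregate per-partition lag into group-level values suitable for alerting.
    lags = [partition_lag(p) for p in offsets]
    return {"lag_sum": sum(lags), "lag_max": max(lags, default=0)}


if __name__ == "__main__":
    sample = [
        PartitionOffsets("orders", 0, log_end_offset=120, committed_offset=100),
        PartitionOffsets("orders", 1, log_end_offset=50, committed_offset=50),
    ]
    print(group_lag(sample))
```

The same pattern (a per-event value plus sum/max/avg aggregates) would apply to the join-time and heartbeat examples mentioned in the comments.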
Why is solving this problem impactful?
Because it helps us troubleshoot issues with CGs. Currently, if we want to see some of this, for example "Handling join request/PreparingRebalance", we have to turn on TRACE, and for "kafka", which is very noisy.
Additionally, with metrics, customers can define alerts for things such as CGs with a high amount of rebalancing.
Additional notes
for inspo: Consumer Group Metric
JIRA Link: CORE-1092