Open luos opened 2 weeks ago
We can both use a tag (if we do so for other Ra machine process types) and provide a new endpoint or a metric family.
Thanks, yeah, I was thinking similar to what the-mikedavis proposed in the PR.
I think it is worth considering a different family, i.e. we may have many thousands of queues but only one or a few metadata processes, and I expect we will care more about Khepri than QQ indexes, though I can't say for sure at this point in time. :-)
At the same time, I do not expect the Khepri process to take a lot of traffic - but I am sure it will happen.
Hi,
We're testing out Khepri and reviewing how we could monitor its behaviour and performance.
Today, we monitor mnesia transaction counters to see if there is a high amount of churn in the system - mostly because mnesia can cause some issues if the transaction count / restarts are very high.
We've noticed that the metrics in the metric family ra_metrics show up without any tags, which I think potentially should be either in a different family, i.e. khepri or metadata, or they should have proper tags, i.e. for the rabbit_metadata | quorum_queues, etc.
Test setup:
curl localhost:15692/metrics/detailed?family=ra_metrics
rabbitmqctl eval 'rabbit_khepri:status().'
Excerpt from the output:
Describe the solution you'd like
ra_process=rabbit_metadata
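To illustrate what such a tag would enable on the consumer side, here is a minimal Python sketch that parses Prometheus text-format samples and selects the ones labelled ra_process="rabbit_metadata". The sample text and the metric name example_raft_commit_index are made up for illustration, and ra_process is the label proposed above, not something the endpoint emits today.

```python
import re

# Hypothetical sample output: the metric name and label values are
# illustrative only and are not taken from a real RabbitMQ node.
SAMPLE = '''\
# TYPE example_raft_commit_index gauge
example_raft_commit_index{ra_process="rabbit_metadata"} 1200
example_raft_commit_index{ra_process="quorum_queues",queue="orders"} 57
example_raft_commit_index 9
'''

# Simplified matcher for "name{labels} value" lines; optional timestamps
# after the value are not handled in this sketch.
LINE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (name, labels, value) tuples, skipping comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        match = LINE_RE.match(line)
        if not match:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ''))
        samples.append((name, labels, float(value)))
    return samples


def select(samples, **wanted):
    """Keep only the samples whose labels match every key/value in wanted."""
    return [s for s in samples
            if all(s[1].get(k) == v for k, v in wanted.items())]


all_samples = parse_samples(SAMPLE)
metadata = select(all_samples, ra_process="rabbit_metadata")
untagged = [s for s in all_samples if not s[1]]
```

With the tag in place, the metadata store can be picked out with a single label match. The untagged sample corresponds to the current situation, where nothing tells the consumer which Ra process a value belongs to.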
Describe alternatives you've considered
I tried to look at the metric collection code, but from my cursory review I could not figure out how to add the tag, and I am not even sure that would be the preferred way to go about it. 😄
Additional context
Due to Khepri's consistency behaviour with projections, it would be good to know if a node is falling behind.
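As a rough sketch of what "falling behind" could mean in numbers, the Python snippet below compares each node's last applied Raft index with the highest commit index seen across the cluster. The field names commit_index and last_applied, the node names, and the threshold are assumptions for illustration; which counters the Khepri member actually exposes, and under what names, is not confirmed here.

```python
def apply_lag(readings):
    """Return {node: lag}, where lag is how many log entries the node
    still has to apply to reach the highest commit index seen.

    readings maps a node name to a dict with hypothetical keys
    "commit_index" and "last_applied".
    """
    cluster_commit = max(r["commit_index"] for r in readings.values())
    return {node: cluster_commit - r["last_applied"]
            for node, r in readings.items()}


def lagging_nodes(readings, threshold):
    """Return the sorted names of nodes whose lag exceeds the threshold."""
    return sorted(node for node, lag in apply_lag(readings).items()
                  if lag > threshold)


# Made-up readings for a three-node cluster.
readings = {
    "rabbit@node1": {"commit_index": 1200, "last_applied": 1200},
    "rabbit@node2": {"commit_index": 1200, "last_applied": 1195},
    "rabbit@node3": {"commit_index": 1100, "last_applied": 1050},
}

lags = apply_lag(readings)
behind = lagging_nodes(readings, threshold=10)
```

A per-process tag would make it possible to compute this for the metadata store alone, without mixing in quorum queue members.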