der-eismann closed this issue 3 months ago
This is open source software, you are welcome to contribute what you find missing.
The data comes from rabbit_vm:memory/0 and the metrics belong to this group.
Wow, I wasn't even able to finish my Erlang introductory course in that short time. Thanks for adding these metrics so quickly!
These metrics are fairly expensive with many queues and streams, so we will limit this to 4.0 and look for ways to optimize this or make this opt-in.
@der-eismann We have now merged this into main/4.0 (but not 3.13). There's a dedicated endpoint for these metrics: https://www.rabbitmq.com/docs/next/prometheus#memory-breakdown-endpoint
However, I struggle to find a nice Grafana visualization for these metrics. There are quite a few of them, and once you multiply that by the number of nodes in the cluster, you get a lot of data points. Are you currently visualizing these metrics from the exporter? Can you share what that looks like? Ideally, if you could contribute a panel for them, that'd be great.
The RabbitMQ Overview dashboard JSON source file is here if you want to give it a try: https://github.com/rabbitmq/rabbitmq-server/blob/main/deps/rabbitmq_prometheus/docker/grafana/dashboards/RabbitMQ-Overview.json
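For anyone who wants to look at the numbers before building a panel, here is a rough sketch of ranking the memory categories from a scrape. It assumes only that the endpoint returns the standard Prometheus text exposition format. The metric names and the `node` label in `EXAMPLE` are made up for illustration and are not the names the plugin actually exposes, and `parse_samples` and `largest_categories` are hypothetical helpers.

```python
import re
from collections import defaultdict

# Matches "name{labels} value" or "name value" in the Prometheus text format.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return (metric_name, labels, value) tuples, skipping comments and blanks."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def largest_categories(samples, top=3):
    """Sum each metric across all nodes and return the biggest ones first."""
    totals = defaultdict(float)
    for name, _labels, value in samples:
        totals[name] += value
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top]


# Hypothetical scrape output; the real metric and label names may differ.
EXAMPLE = """\
# TYPE example_memory_allocated_unused_bytes gauge
example_memory_allocated_unused_bytes{node="rabbit@a"} 900
example_memory_binary_bytes{node="rabbit@a"} 300
example_memory_allocated_unused_bytes{node="rabbit@b"} 100
"""

if __name__ == "__main__":
    for name, total in largest_categories(parse_samples(EXAMPLE)):
        print(name, total)
```

Summing across nodes is only one way to cut the data. Grouping by the node label instead would give the per-node view discussed in this thread.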
Hey @mkuratczyk, we used these metrics to figure out why the memory consumption was so high, and with them we noticed that a huge chunk is allocated unused. The visualization is more of a quick and dirty kind, but I can try to polish it and contribute it.
But these are from the old exporter, we don't have the 4.0 beta running yet. Need to invest some time for that, not sure when I can find that in the next two weeks.
That's ok, no rush. Seems like the external exporter provided fewer metrics and you still presented them separately for each node (which totally makes sense). As usual, the problem for us is that when we provide something, users expect it to "just work everywhere", and some users have 9 or more nodes in the cluster, so that's suddenly quite a few new panels. Perhaps a separate dashboard would be useful. Then we can just do it per node and use Grafana's repeat option.
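A rough sketch of what a per-node repeated panel could look like, built as a Python dict and dumped to dashboard-style JSON. The `repeat`, `repeatDirection` and `maxPerRow` keys are regular Grafana panel fields. Everything else here is an assumption for illustration: the `node` template variable, the `example_memory_` metric prefix, and the `node` label would all have to be replaced with whatever the dashboard and the plugin actually use.

```python
import json


def repeated_memory_panel(variable="node"):
    """Build one panel definition that Grafana repeats per value of `variable`."""
    var_ref = "$" + variable  # Grafana template variable reference, e.g. "$node"
    return {
        "type": "timeseries",
        "title": "Memory breakdown: " + var_ref,
        # Repeat this panel once for every selected value of the variable.
        "repeat": variable,
        "repeatDirection": "h",
        "maxPerRow": 3,
        "targets": [
            {
                # Hypothetical selector: metric prefix and label name are placeholders.
                "expr": '{__name__=~"example_memory_.+", node="' + var_ref + '"}',
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(repeated_memory_panel(), indent=2))
```

With a multi-value variable listing the cluster nodes, a single definition like this would expand into one panel per node, so a 9-node cluster would not need 9 hand-written panels.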
Is your feature request related to a problem? Please describe.
Hey everyone, we are currently working on replacing the soon-to-be EOL https://github.com/kbudde/rabbitmq_exporter with the built-in prometheus plugin. With that exporter it was possible to get detailed memory statistics from the management plugin, which have helped us debug issues: https://github.com/rabbitmq/rabbitmq-server/blob/main/deps/rabbitmq_management/priv/www/js/tmpl/memory.ejs#L9-L31
Unfortunately I was unable to get these metrics from the prometheus plugin; the only thing that came close was process_resident_memory_bytes (https://github.com/rabbitmq/rabbitmq-server/blob/main/deps/rabbitmq_prometheus/src/collectors/prometheus_rabbitmq_core_metrics_collector.erl#L72).
Describe the solution you'd like
Provide all memory metrics from the management UI via prometheus plugin
Describe alternatives you've considered
No response
Additional context
No response