vernemq / vernemq

A distributed MQTT message broker based on Erlang/OTP. Built for high quality & Industrial use cases. The VerneMQ mission is active & the project maintained. Thank you for your support!
https://vernemq.com
Apache License 2.0

Configure monitoring so that metrics show data for the whole cluster #1562

Closed HowellTan closed 4 years ago

HowellTan commented 4 years ago

Environment

allow_anonymous = off
listener.tcp.default = 0.0.0.0:1883
listener.ssl.default = 0.0.0.0:8883
listener.ws.default = 0.0.0.0:8080
listener.wss.default = 0.0.0.0:8081
listener.tcp.allowed_protocol_versions = 3,4,5
distributed_cookie = ...

plugins.vmq_diversity = on
plugins.vmq_passwd = off
plugins.vmq_acl = off
vmq_diversity.auth_postgres.enabled = on
vmq_diversity.postgres.host = ...
vmq_diversity.postgres.port = 5432
vmq_diversity.postgres.user = ...
vmq_diversity.postgres.password = ...
vmq_diversity.postgres.database = ...

nodename = VerneMQ@${IP_ADDRESS}
listener.vmq.clustering = ${IP_ADDRESS}:44053
listener.vmqs.clustering = ${IP_ADDRESS}:18884

Expected behavior

Configure VerneMQ so that the metrics include all nodes (i.e. show data for the entire cluster).

Actual behavior

I have read the VerneMQ docs on Prometheus. For context, I have a VerneMQ cluster with 3 nodes behind a load balancer. I want to use Prometheus to scrape VerneMQ metrics, but each request to :8888/metrics returns data for only one specific node at a time.

ioolkos commented 4 years ago

Hi @HowellTan, thanks for your input. Yes, that's how Prometheus scraping is currently implemented, and it's likely to stay that way: otherwise, Verne nodes would have to either somehow forward the scrape request or have all the metrics replicated & ready.

tartieret commented 4 years ago

@ioolkos I have the same issue. How do you suggest monitoring a VerneMQ cluster with multiple replicas?

For instance, if I have a Swarm cluster with three nodes, Prometheus scrapes metrics from only one node at a time.

ioolkos commented 4 years ago

@tartieret @HowellTan By scraping every node and then separating the metrics by node name in Grafana. You could build an aggregated view (where all cluster nodes are shown in a single view), or some form of drill-down view for each node from a general pane.

I know the Grafana thing in this repo is only very basic. I've seen people build elaborate views, but so far no project has contributed back what they built (maybe because it seemed too specific to their use case).
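
The idea of scraping every node and then combining or separating the results by node can be illustrated outside of Prometheus and Grafana too. The following is only a minimal Python sketch, not how Prometheus itself works: it assumes each node serves the Prometheus text format on its own `:8888/metrics` endpoint (reached directly, not through the load balancer), and the function names and node keys are hypothetical. The parser is deliberately simplified and sums all label combinations of a metric into one value.

```python
import re
import urllib.request
from collections import defaultdict

# Simplified matcher for one sample line of the Prometheus text format:
# metric name, optional {labels}, then the value. It does not handle
# label values that themselves contain a "}" character.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{[^}]*\})?\s+(\S+)')


def parse_metrics(text):
    """Return {metric_name: value summed over all label sets} for one node."""
    totals = defaultdict(float)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        try:
            value = float(match.group(2))
        except ValueError:
            continue  # ignore lines whose value is not a number
        totals[match.group(1)] += value
    return dict(totals)


def aggregate(texts_by_node):
    """Build a per-node view and a cluster-wide sum from raw scrape texts."""
    per_node = {node: parse_metrics(text) for node, text in texts_by_node.items()}
    cluster = defaultdict(float)
    for metrics in per_node.values():
        for name, value in metrics.items():
            cluster[name] += value
    return per_node, dict(cluster)


def scrape_cluster(node_urls, timeout=5):
    """Fetch every node's metrics endpoint (hypothetical URLs) and aggregate."""
    texts = {}
    for node, url in node_urls.items():
        with urllib.request.urlopen(url, timeout=timeout) as response:
            texts[node] = response.read().decode("utf-8")
    return aggregate(texts)
```

Calling `scrape_cluster` with a dict that maps a node label to that node's own metrics URL would return both views: `per_node` corresponds to the drill-down per node, and `cluster` to the aggregated view. Note that summing across nodes is only meaningful for counter-style metrics; gauges may need a different aggregation.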

ioolkos commented 4 years ago

@tartieret Netdata also recently added a great integration for VerneMQ. You can add multiple servers in the config file too: https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/vernemq

ioolkos commented 4 years ago

I guess this is answered. Feel free to comment or reopen.

tartieret commented 4 years ago

@ioolkos thanks for the info!