sapcc / mosquitto-exporter

Prometheus metrics exporter for the Mosquitto message broker
Apache License 2.0

Broker up/down status #12

Open iwittkau opened 6 years ago

iwittkau commented 6 years ago

Is there a way to determine if the broker is reachable? If my broker goes down, the exporter keeps exposing the last metrics it was able to scrape. I'm currently using something like

abs(broker_heap_current - (broker_heap_current offset 10m))

to determine whether the heap value has changed at all, but that has not been very reliable so far.

I'm using the 0.4.0 tagged docker image of the exporter.

ArtieReus commented 5 years ago

Based on the Broker Status section documented here: https://mosquitto.org/man/mosquitto-8.html, I don't see any other option.

iwittkau commented 5 years ago

I think there is a simpler way: other exporters expose a metric like last_scrape_time. Then you are able to monitor the time since that timestamp. Also a counter metric like broker_connection_errors_sum is possible by counting the connection errors somewhere here: https://github.com/sapcc/mosquitto-exporter/blob/master/main.go#L167

Related suggestions from the prometheus docs: https://prometheus.io/docs/instrumenting/writing_exporters/#metrics-about-the-scrape-itself

If I find some time I will create a PR if that is OK.

daviddetorres commented 5 years ago

I was thinking of adding a metric with the seconds since the last $SYS message. An alert could then fire if that value is higher than 60 seconds (or the $SYS update interval configured in the broker).

An up/down status derived from the state of the connection with the broker could also be a good idea. Combined with the other metric, it can detect both broker failure scenarios:

  1. The broker breaks and the connection is closed.
  2. The socket and the connection are OK, but the broker is stalled or saturated and doesn't send updates (or, probably, the other messages in its queue).

If you think it is a good idea, I can work on a PR for it.