albertodonato / query-exporter

Export Prometheus metrics from SQL queries
GNU General Public License v3.0

Add a metric for "last success" #164

Closed by wrossmann 10 months ago

wrossmann commented 10 months ago

Is your feature request related to a problem? Please describe.

Recently one of our devs re-created a view that the exporter queries, but forgot to re-create the necessary permissions. From that point on the query errored out, but query-exporter simply kept serving the result of the last successful run. If the cached value hadn't itself been the subject of an alert that wouldn't resolve, we might not have noticed this until something went horribly wrong while the exporter was still serving an "everything is A-OK" value.

Describe the solution you'd like

I would like to suggest adding a metric that exposes the timestamp of the last successful run of each query. This way we can set an alert as simple as time() - query_last_success > 300 to be made aware of a failing query.
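To make the intended alert condition concrete, here is a minimal Python sketch of the same comparison. It assumes query_last_success is the Unix timestamp this request proposes (it is not an existing metric), and the function name and the 300-second default are only illustrative.

```python
import time


def query_is_stale(last_success_ts, now=None, max_age_seconds=300):
    """Return True when the last successful run is older than max_age_seconds.

    last_success_ts stands in for the proposed query_last_success value
    (a Unix timestamp); it is an assumption, not an existing metric.
    """
    if now is None:
        now = time.time()
    # Same comparison as the alert: time() - query_last_success > 300
    return (now - last_success_ts) > max_age_seconds
```

A query whose last success was 301 seconds ago would be flagged, while one that succeeded exactly 300 seconds ago would not, since the comparison is strict.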

Describe alternatives you've considered

I've looked through the other values exposed by the exporter, but I don't see any that would be useful in calculating if the query ran successfully or not.

We could also technically parse the logs and watch for error lines, but that's more of a reactive solution, and Prometheus is entirely capable of monitoring this itself.

Additional context

¯\_(ツ)_/¯

wrossmann commented 10 months ago

I am a dope. The queries_total{status="error"} metric is right there.
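For anyone landing here with the same problem, the following is a rough Python sketch of how that counter can be used: it pulls the status="error" samples of queries_total out of a scraped metrics page and reports which label sets grew between two scrapes. Only the metric name and the status label come from this thread; the helper names are hypothetical, the other labels are whatever the exporter happens to attach, and the regex is a simplified reader of the text exposition format rather than a full parser.

```python
import re

# Simplified match for one sample line: name{labels} value
# (assumes label values contain no '}' or escaped quotes)
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def error_counts(exposition_text):
    """Map each label set (minus status) to its queries_total error count."""
    counts = {}
    for line in exposition_text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match or match.group("name") != "queries_total":
            continue
        labels = dict(LABEL_RE.findall(match.group("labels")))
        if labels.get("status") != "error":
            continue
        key = tuple(sorted((k, v) for k, v in labels.items() if k != "status"))
        counts[key] = float(match.group("value"))
    return counts


def newly_failing(previous, current):
    """Return label sets whose error count increased since the previous scrape."""
    return [key for key, value in current.items() if value > previous.get(key, 0.0)]
```

Inside Prometheus itself the equivalent would be an alert on the counter increasing over a time window, which avoids any scraping code at all.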

Please delete my shame. I would if I could.