decred / dcrstakepool

Stakepool for Decred.

Prometheus metrics #525

Open isuldor opened 4 years ago

isuldor commented 4 years ago

Looking for feedback as I've added instrumentation to stakepoold in my environment to optionally export metrics for Prometheus. You can see the changes here: https://github.com/isuldor/dcrstakepool/pull/1/files

Obviously it doesn't make sense to merge this upstream if no one else is using Prometheus, since it has the downside of requiring three Prometheus client libraries. But the upside of using Prometheus to monitor stakepoold directly is that it can detect outages quickly (i.e. giving you a chance to fix things before dcrstakepool implodes on itself). If there's interest, I can write a tutorial on setting up Prometheus to scrape stakepoold and send out alerts when necessary. I've been sending alerts to my mobile using pushover.net (5 bucks for life), and there is also Opsgenie (like PagerDuty, but free for up to 5 users). I might also look into deploying this stuff with Docker (à la lndmon), but that's further down the line for me.

So, would you want this PR for exporting Prometheus metrics?

Update: Oh, the libraries being imported also include methods for parsing or bridging into other time-series/monitoring systems like InfluxDB, Sensu, Metricbeat, Datadog, etc. So an operator using any of those systems should be able to make use of these metrics. Apparently OpenMetrics is going to be based on the Prometheus exposition format. If no one has strong objections, I'll submit a PR next week.
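To illustrate why other monitoring systems can consume these metrics, here is a rough sketch of reading the text exposition format with only the Go standard library. It is a simplification under stated assumptions: it handles only unlabeled samples of the form `name value [timestamp]`, skips comment lines, and is not how the client libraries or any of the systems named above implement parsing. The function name `parseSamples` and the sample input are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseSamples extracts unlabeled samples from exposition-format text.
// Lines starting with "#" (HELP/TYPE comments) and blank lines are skipped.
// An optional trailing timestamp field is ignored.
func parseSamples(text string) (map[string]float64, error) {
	samples := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			return nil, fmt.Errorf("malformed sample line: %q", line)
		}
		value, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return nil, fmt.Errorf("bad value in %q: %w", line, err)
		}
		samples[fields[0]] = value
	}
	return samples, sc.Err()
}

func main() {
	// Hypothetical scrape output; metric names are illustrative only.
	scrape := "# TYPE stakepoold_tickets_voted_total counter\n" +
		"stakepoold_tickets_voted_total 3\n" +
		"stakepoold_wallets_connected 1\n"

	samples, err := parseSamples(scrape)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("tickets voted:", samples["stakepoold_tickets_voted_total"])
}
```

Real exposition output usually carries labels such as `{instance="..."}`, which this sketch does not handle, so a production bridge would rely on an existing parser.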

JoeGruffins commented 4 years ago

I've never used Prometheus, but it sounds good to me!