github / gh-ost

GitHub's Online Schema-migration Tool for MySQL
MIT License

Prometheus Monitoring #104

Open SuperQ opened 8 years ago

SuperQ commented 8 years ago

It would be good to integrate the Prometheus client library for monitoring long-running jobs.

shlomi-noach commented 8 years ago

@SuperQ thank you -- can you please elaborate? I'm actually not familiar with Prometheus. What would be the benefits of gh-ost reporting directly from the app?

Actually, I'm having this exact same discussion on orchestrator. It supports Graphite, but will now need to support Datadog. And then maybe something else. I'm wondering whether the tool should explicitly support these, or whether it should instead export metrics in a way that external monitoring tools can read.

SuperQ commented 8 years ago

Any time the internal state of the app changes (for example starting, pausing, errors, copy progress), gh-ost updates counters, and the Prometheus library allows these to be polled from the outside.

Then things can be graphed and displayed, or alerts written to discover stalled gh-ost jobs.

The Prometheus output format is very generic and can be used to feed other systems. It's not done yet, but we're working on splitting the Prometheus library into the counter parts and the exposition parts in order to make this easier. (Our golang counter code is very well optimized.)

shlomi-noach commented 8 years ago

> Any time the internal state of the app changes (for example starting, pausing, errors, copy progress), gh-ost updates counters, and the Prometheus library allows these to be polled from the outside.

To clarify, gh-ost would need to proactively tell Prometheus about any state change, much like it would write to Graphite. The "polled from the outside" part is unrelated to gh-ost. Is that correct?

SuperQ commented 8 years ago

No, Prometheus is a polling-type metrics collector (think SNMP, but way better). gh-ost simply keeps track of the counters and Prometheus polls it. I'll look over the current code and see what needs to be done.
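The pull model described here can be illustrated with a minimal, standard-library-only sketch. It does not use the Prometheus client library; it writes the Prometheus text exposition format by hand, and the counter `rowsCopied`, the metric name `ghost_rows_copied_total`, and the `/metrics` path are assumptions made for illustration, not anything gh-ost actually exposes. The `http.Get` call stands in for the Prometheus server scraping the process.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// rowsCopied is a hypothetical in-process counter. The application only
// increments it; it never pushes the value anywhere.
var rowsCopied int64

// renderMetrics formats the current counter value in the Prometheus text
// exposition format (HELP line, TYPE line, then one sample).
func renderMetrics() string {
	return fmt.Sprintf(
		"# HELP ghost_rows_copied_total Rows copied so far (hypothetical metric).\n"+
			"# TYPE ghost_rows_copied_total counter\n"+
			"ghost_rows_copied_total %d\n",
		atomic.LoadInt64(&rowsCopied),
	)
}

// metricsHandler serves the rendered metrics to whoever polls the endpoint.
func metricsHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	fmt.Fprint(w, renderMetrics())
}

func main() {
	// The application side: update the counter as work progresses.
	atomic.AddInt64(&rowsCopied, 1000)

	// A throwaway local server so the sketch terminates on its own.
	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", metricsHandler)
	srv := httptest.NewServer(mux)
	defer srv.Close()

	// The scraper side: poll the endpoint from the outside.
	resp, err := http.Get(srv.URL + "/metrics")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(string(body))
}
```

In a real integration the client library would presumably take care of the counter type and the handler; the point of the sketch is only that the application keeps state and the collector decides when to read it.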

shlomi-noach commented 8 years ago

As recently implemented in orchestrator, I'd use go-metrics/exp to export metrics in an expvar-like format via /debug/metrics.
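For readers unfamiliar with the expvar style of export, here is a rough sketch using only Go's standard `expvar` package. This is a stand-in, not the go-metrics/exp package mentioned above: the standard package publishes JSON at `/debug/vars` rather than `/debug/metrics`, and the variable name `ghost.rows_copied` is a hypothetical chosen for illustration.

```go
package main

import (
	"expvar"
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
)

// copiedRows is a hypothetical published variable. expvar.NewInt registers
// it under the given name so it appears in the JSON output.
var copiedRows = expvar.NewInt("ghost.rows_copied")

func main() {
	// The application side: update the variable as work progresses.
	copiedRows.Add(1000)

	// Importing expvar registers a handler for /debug/vars on
	// http.DefaultServeMux; serve that mux from a throwaway local server.
	srv := httptest.NewServer(http.DefaultServeMux)
	defer srv.Close()

	// Read the exported variables from the outside.
	resp, err := http.Get(srv.URL + "/debug/vars")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}

	// The body is a single JSON object that also contains the default
	// "cmdline" and "memstats" entries alongside the custom variable.
	fmt.Println(string(body))
}
```

The output is one JSON document, which is a different wire format from the line-oriented text that a Prometheus scrape expects.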

SuperQ commented 8 years ago

expvar is not Prometheus-compatible. :-/