raintank / worldping-api

Worldping Backend Service

making alerting mergeable into grafana/grafana #7

Closed woodsaj closed 8 years ago

woodsaj commented 8 years ago

Issue by Dieterbe Monday Jun 22, 2015 at 00:35 GMT Originally opened as https://github.com/raintank/grafana/issues/230


what needs to happen before we can merge the alerting feature into grafana/grafana? here are my thoughts:

once this is done, we can start working on arbitrary/user-specified alerts

woodsaj commented 8 years ago

Comment by Dieterbe Thursday Aug 13, 2015 at 13:32 GMT


other things to keep in mind:

woodsaj commented 8 years ago

Comment by Dieterbe Monday Aug 31, 2015 at 08:10 GMT


@torkelo i think all of this depends a lot on how you think plugins should be implemented in grafana. can you start thinking about the stuff above? maybe the best way to make progress on this is to do a hangout together, where we sketch out a design and go over the current alerting code to see how we would refactor it using the plugin system.

woodsaj commented 8 years ago

Comment by woodsaj Monday Aug 31, 2015 at 11:43 GMT


I would like to be involved in any discussion on this.

splitting the raintank code into a separate isolated repository is moving up the priority list.

woodsaj commented 8 years ago

Comment by torkelo Monday Aug 31, 2015 at 12:00 GMT


I think before we know what it would take to copy any of the raintank alerting code to grafana, we need to start working on alerting in grafana from the UI and domain model perspective. When we get to scheduling and query execution we can see how we can utilize what has been written and what modifications are required.

Does this make sense?

It could go like this:
1) Start working on the frontend alerting tab and the panel alert definition model (should be pretty simple: reference a metric query, thresholds, timespan or nr of points rule, notification tag)
2) Build a backend cache / model that indexes all alerting rules
3) Figure out how that alerting model index can feed a scheduler (or an http api with an external scheduler)
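
To make the panel alert definition model from step 1 a bit more concrete, here is a rough sketch in Go. Everything in it is an assumption for illustration only: the `AlertRule` type, its field names, the JSON keys and the `Validate` helper are hypothetical and not an agreed format. It only captures the pieces named above (metric query reference, thresholds, timespan or nr of points rule, notification tag).

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// AlertRule is a hypothetical panel alert definition.
// All field names and JSON keys are assumptions, not an agreed schema.
type AlertRule struct {
	QueryRef        string  `json:"queryRef"`            // which metric query of the panel to evaluate
	Warn            float64 `json:"warn"`                // warning threshold
	Crit            float64 `json:"crit"`                // critical threshold
	Timespan        string  `json:"timespan,omitempty"`  // e.g. "5m"; alternative to NumPoints
	NumPoints       int     `json:"numPoints,omitempty"` // nr of points rule; alternative to Timespan
	NotificationTag string  `json:"notificationTag"`     // who or what gets notified
}

// Validate checks that the rule references a query and uses exactly one
// of the two window styles (timespan or nr of points).
func (r AlertRule) Validate() error {
	if r.QueryRef == "" {
		return fmt.Errorf("alert rule needs a metric query reference")
	}
	hasSpan := r.Timespan != ""
	hasPoints := r.NumPoints > 0
	if hasSpan == hasPoints {
		return fmt.Errorf("set exactly one of timespan or numPoints")
	}
	if hasSpan {
		if _, err := time.ParseDuration(r.Timespan); err != nil {
			return fmt.Errorf("invalid timespan %q: %v", r.Timespan, err)
		}
	}
	return nil
}

func main() {
	// Example of how the rule might be stored inside a panel's json.
	raw := []byte(`{"queryRef":"A","warn":80,"crit":90,"timespan":"5m","notificationTag":"ops"}`)

	var rule AlertRule
	if err := json.Unmarshal(raw, &rule); err != nil {
		fmt.Println("decode error:", err)
		return
	}
	if err := rule.Validate(); err != nil {
		fmt.Println("invalid rule:", err)
		return
	}
	fmt.Printf("%+v\n", rule)
}
```

Keeping the rule as plain json in the panel would let the frontend own the definition while the backend only has to decode and validate it.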

woodsaj commented 8 years ago

Comment by Dieterbe Monday Aug 31, 2015 at 12:46 GMT


I think the UI work will take quite a while to take shape. we already have a pretty good idea of how things will look frontend-wise (rules in the dashboard json) so I think we can start working on the backend as well, i.e. your 2 and 3, which includes pluginizing the litmus handler. If we have the resources to work on both (2-3 and 1) at the same time, I think we should. And I definitely have the time to work on 2/3, though I do need guidance to make sure it fits within upstream grafana, which also takes some time. I think serializing tasks that could be worked on in parallel will needlessly slow us down.

with my raintank-metric to nsq project wrapping up i was hoping to start working on alerting again. but i guess i could also do something else like probe/tsdb. .. @woodsaj ?

woodsaj commented 8 years ago

Comment by torkelo Monday Aug 31, 2015 at 12:50 GMT


@Dieterbe you could start working on building an alerting domain model / cached index, based on the json in each panel in the dashboards. This will require quite a bit of backend code I think (to keep it up to date and synced across grafana-servers)

Each dashboard's panel will be able to have an alerting rule definition. But in order to make alerting overview pages, and for the scheduler or api to list "panels with alerting", we need an index that is up to date.
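
As a rough illustration of such an index, here is a minimal in-memory sketch in Go. It is not existing grafana code: the `Index` and `Entry` types, the flat `panels` array and the `alert` key in the dashboard json are all assumptions made for the example. It only shows the idea of re-indexing a dashboard whenever it is saved and listing the "panels with alerting"; syncing across grafana-servers is left out entirely.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"sync"
)

// panel and dashboard mirror only the parts of the dashboard json this
// sketch needs. The flat "panels" array and the "alert" key are assumptions.
type panel struct {
	ID    int             `json:"id"`
	Title string          `json:"title"`
	Alert json.RawMessage `json:"alert,omitempty"`
}

type dashboard struct {
	Panels []panel `json:"panels"`
}

// Entry is one indexed "panel with alerting" (hypothetical type).
type Entry struct {
	DashboardID string
	PanelID     int
	Title       string
	Rule        json.RawMessage // rule definition kept as raw json
}

// Index is a hypothetical in-memory cache of alerting rules per dashboard.
type Index struct {
	mu     sync.RWMutex
	byDash map[string][]Entry
}

func NewIndex() *Index {
	return &Index{byDash: make(map[string][]Entry)}
}

// Update re-indexes one dashboard; call it whenever the dashboard is saved.
func (ix *Index) Update(dashID string, raw []byte) error {
	var d dashboard
	if err := json.Unmarshal(raw, &d); err != nil {
		return err
	}
	var entries []Entry
	for _, p := range d.Panels {
		if len(p.Alert) == 0 || string(p.Alert) == "null" {
			continue // panel has no alerting rule
		}
		entries = append(entries, Entry{DashboardID: dashID, PanelID: p.ID, Title: p.Title, Rule: p.Alert})
	}
	ix.mu.Lock()
	defer ix.mu.Unlock()
	if len(entries) == 0 {
		delete(ix.byDash, dashID)
	} else {
		ix.byDash[dashID] = entries
	}
	return nil
}

// Remove drops a deleted dashboard from the index.
func (ix *Index) Remove(dashID string) {
	ix.mu.Lock()
	defer ix.mu.Unlock()
	delete(ix.byDash, dashID)
}

// List returns all panels with alerting, sorted by dashboard and panel id.
func (ix *Index) List() []Entry {
	ix.mu.RLock()
	defer ix.mu.RUnlock()
	var out []Entry
	for _, entries := range ix.byDash {
		out = append(out, entries...)
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].DashboardID != out[j].DashboardID {
			return out[i].DashboardID < out[j].DashboardID
		}
		return out[i].PanelID < out[j].PanelID
	})
	return out
}

func main() {
	ix := NewIndex()
	raw := []byte(`{"panels":[{"id":1,"title":"cpu","alert":{"crit":90}},{"id":2,"title":"mem"}]}`)
	if err := ix.Update("dash-1", raw); err != nil {
		fmt.Println("index error:", err)
		return
	}
	for _, e := range ix.List() {
		fmt.Printf("%s panel %d (%s): %s\n", e.DashboardID, e.PanelID, e.Title, e.Rule)
	}
}
```

A scheduler or an http api could then read from `List()` instead of scanning every dashboard, which is the reason for keeping the index up to date on every save.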

woodsaj commented 8 years ago

Comment by Dieterbe Tuesday Sep 01, 2015 at 12:23 GMT


step 1 needs https://github.com/grafana/grafana/issues/2643

woodsaj commented 8 years ago

no longer relevant.