bitprophet / descartes

Introspective dashboard for Graphite
MIT License

Graph/dash parameterization #1

Open bitprophet opened 11 years ago

bitprophet commented 11 years ago

The problem

The use case is to "plug in" external values when generating graphs or dashboards, instead of only working with 100% literal/static metric paths.

For example, say you have a cluster of servers s01, s02, s03, and s04, and you want all of their load averages in one graph.
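Concretely, the goal is to go from N hand-added literal paths to one parameterized target. A minimal sketch (the `servers.<host>.loadavg.05` layout is assumed from the example above, not taken from Descartes itself):

```python
# Sketch: the four literal metric paths you'd add by hand today, versus
# the single range-wildcard target Graphite itself would accept.
hosts = ["s01", "s02", "s03", "s04"]

# Literal per-host paths (what manual graph-building produces):
expanded = [f"servers.{h}.loadavg.05" for h in hosts]

# Equivalent single wildcard target:
wildcard_target = "servers.s0[1-4].loadavg.05"

print(expanded)
print(wildcard_target)
```

The parameterization problem is essentially: who knows how to produce that wildcard (or the expanded list) from a symbolic name like "my cluster"?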

Currently one must load up the Metrics screen, filter for the metric(s) you want, and add them to the in-progress graph until you're happy. So in the above example, to build a graph of your cluster's 5-minute load average, you would:

This scales really poorly for nontrivial cluster sizes, and is annoying at best for trivial ones.

Alternately:

This scales better for individual clusters, but requires its own annoying song and dance, and thus doesn't scale when you need to manage many clusters.

Possible solutions

  1. Subject metric paths to a templating solution. E.g. define a graph as servers.$cluster01.loadavg.05; the parse step looks in a config file or talks to an API, and expands $cluster01 into s0[1-4].
    • Reasonably straightforward
    • Requires some sort of UI change to account for "metrics" no longer being 1:1 mappings to real values in Graphite, and for how that affects the graph builder.
      • Perhaps an option to build graphs from hand-entered metric paths (like in Graphite's own composer), which could then be parse/expansion-aware. This would be a useful mode even without any expansion implemented, really.
    • Not the most elegant thing ever
    • If API-driven (vs. a config-managed conf file), we might want to piggyback on existing caching to avoid querying the external API constantly
  2. Create real models for Clusters/Services/Hosts etc, and have a "node-based" (in the Graphite sense) metric path concept. E.g. servers.<server>.loadavg.05, then Descartes loads up that path and shows you the graphs with all existing Cluster or Host values plugged in.
    • Similar approach to previous, but more organized
    • Allows for navigation concepts like "I want to view CPU for (all my hosts|just hosts in my prod env|just hosts in Cassandra Cluster 2|etc)", drill down, etc.
      • Technically we could apply that to the first option too, you'd just be choosing from the arbitraryish list of expandos.
    • Problem: starts pulling organization from your real truth database into Descartes' DB schema. The deeper you build out these concepts in Descartes, the higher chance for conflict with how your existing systems organize the same info.
    • Problem: by storing the external truth DB's info in Postgres, you open the door to sync problems you wouldn't have if simply caching "dumb" X=Y expansion maps.
  3. Have "dummy" model objects which are transient and wrap the remote DB's data. I.e. from a UI perspective they appear to be useful objects of certain classes, but from a data-storage perspective it's all API-driven (again, possibly with caching).
    • This is another mutation of its predecessor
    • Might be the best of both worlds: gives some form to the expansions that are going on, but doesn't try to keep a persistent copy of the data.
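The first option's parse/expand step can be sketched in a few lines. This is purely illustrative — the `$name` token syntax and the `EXPANSIONS` mapping are assumptions, not Descartes' actual API; in practice the mapping would be loaded from a config file or fetched (and cached) from the truth-DB API:

```python
import re

# Hypothetical expansion map; could come from YAML or an external API.
EXPANSIONS = {
    "cluster01": "s0[1-4]",
}

def expand_path(path, expansions=EXPANSIONS):
    """Replace each $name token in a metric path with its expansion."""
    def repl(match):
        name = match.group(1)
        try:
            return expansions[name]
        except KeyError:
            raise KeyError(f"no expansion defined for ${name}")
    return re.sub(r"\$(\w+)", repl, path)

print(expand_path("servers.$cluster01.loadavg.05"))
# -> servers.s0[1-4].loadavg.05
```

Options 2 and 3 differ mainly in where `EXPANSIONS` lives: real persisted models in option 2, transient API-backed wrappers in option 3.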
bitprophet commented 11 years ago

Talking to truth DB

Storage/organization on Descartes side

bitprophet commented 11 years ago

Digging into specifics:

Need to figure out:

bitprophet commented 11 years ago

Have basic interpolation working \o/ and it does indeed seem to work everywhere.
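If the interpolation values come from an external API (as discussed above), a small TTL cache keeps lookups cheap. A generic sketch, not Descartes code — the `fetch` callable and 300-second default are assumptions:

```python
import time

class TTLCache:
    """Tiny time-based cache for expansion maps fetched from an
    external truth-DB API. Purely illustrative."""
    def __init__(self, fetch, ttl=300):
        self.fetch = fetch  # zero-arg callable returning a fresh map
        self.ttl = ttl
        self._value = None
        self._stamp = 0.0

    def get(self):
        if self._value is None or time.monotonic() - self._stamp > self.ttl:
            self._value = self.fetch()
            self._stamp = time.monotonic()
        return self._value

calls = []
cache = TTLCache(lambda: calls.append(1) or {"cluster01": "s0[1-4]"}, ttl=60)
cache.get()
cache.get()
print(len(calls))  # -> 1 (second call served from cache)
```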

Now running into:

bitprophet commented 11 years ago

I also don't see a way to set aliases easily via Descartes, which is another reason the legend is so enormous for my graph above. Torn between it being a Descartes responsibility and being something I should stuff into Graphite (so now this mythical function combines nonNegativeDerivative, scaleToSeconds, and aliasByNode).
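For reference, the three real Graphite functions mentioned compose into a target like the one below. The metric path is a hypothetical placeholder, and the seconds/node arguments are just example values:

```python
def wrap(metric, seconds=1, node=1):
    """Build aliasByNode(scaleToSeconds(nonNegativeDerivative(m), s), n)
    as a Graphite render target string."""
    target = f"nonNegativeDerivative({metric})"
    target = f"scaleToSeconds({target},{seconds})"
    return f"aliasByNode({target},{node})"

print(wrap("servers.s0*.example.metric"))
# -> aliasByNode(scaleToSeconds(nonNegativeDerivative(servers.s0*.example.metric),1),1)
```

The "mythical function" idea is essentially baking this composition into a single named Graphite function so every graph doesn't repeat the nesting.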

bitprophet commented 10 years ago

Revisiting this. Outstanding issues: