andymchugh / andrewbmchugh-flow-panel

Composite metrics #48

Closed madansu closed 3 months ago

madansu commented 4 months ago

How can we define composite metrics?

I am trying to choose a box color from 3 colors, based on 3 different metrics: CPU, HTTP status and ICMP reachability. I have 3 time series: box1_cpu, box1_icmp and box1_httpstat. Each of these series has its own threshold definitions.

Any recommendations on how I can combine the metrics above into a single status indicator? I would use fillColor as my indicator of choice.

Thanks in advance !

andymchugh commented 4 months ago

If I'm following correctly, you have three timeseries that you want to aggregate together in a special way to create a new timeseries from which you would drive a cell's fillColor. If I were doing it, I'd find a way to combine the timeseries in the upstream graphite query. There are a few ways graphite supports that kind of thing.

I don't think the panel functionality will be extended into this space. It really sounds like something that should be done in the datasource. Done at that level you could feed the value through the full gamut of grafana features, whilst done at the panel level you'd only be able to drive the picture.

madansu commented 4 months ago

Composite metrics are nice, especially when attempting to combine metrics of disparate kinds like the 3 listed above. That way the panel decides how to render a cell color based on the logic provided, and the various permutations of deciding colors are abstracted from the user.

That said, I do understand and accept your decision.

Thanks again for your incredible work.

andymchugh commented 4 months ago

Keeping the discussion going: given the three metrics above, what would your pseudocode look like for combining them? I.e. you have three values 50, 404, -1 that individually have their own thresholds/mappings that result in green, amber, yellow. If you could inject a bit of code to combine them into a new metric, what would that code look like?

I'd like to understand what your ideal endpoint would look like. For example, are all threshold level 2's equivalent and so comparable, with level 3 worse than level 2, etc.?

xkilian commented 4 months ago

I personally would be content with ordering thresholds. For example, threshold 1: x=0 red, x=1 green; threshold 2: y=0 blue, y=1 red, y=2 green. If threshold 1 is ok/green, then threshold 2 is evaluated, and so on. This means that each threshold is really a set of values that correspond to colors, and one of those values is deemed the nominal/ok value. It is not a composite metric; it is simply ordering the threshold evaluation. Real composites should be computed in the query language or calculation engine of the TSDB, IMO. Ordering is simpler to implement, but gives a more flexible way to present multiple states for a given location/equipment/service. It is also flexible enough to account for metrics coming from multiple datasources.
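The ordered evaluation described in this comment can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the panel: the `Check` class, the `first_non_nominal` function and the gray fallback color are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Check:
    """One metric's value-to-color mapping plus the value deemed nominal (hypothetical type)."""
    name: str
    colors: dict[int, str]  # exact value -> color
    nominal: int            # the value considered ok


def first_non_nominal(checks: list[Check], values: dict[str, int]) -> str:
    """Walk the checks in priority order and return the color of the first
    check whose value is not nominal. If every check is nominal, return the
    color of the last check evaluated."""
    color = "gray"  # assumed fallback when there are no checks or no mapping
    for check in checks:
        value = values[check.name]
        color = check.colors.get(value, "gray")
        if value != check.nominal:
            return color
    return color


# The two thresholds from the comment above, in evaluation order.
checks = [
    Check("threshold1", {0: "red", 1: "green"}, nominal=1),
    Check("threshold2", {0: "blue", 1: "red", 2: "green"}, nominal=2),
]

print(first_non_nominal(checks, {"threshold1": 1, "threshold2": 0}))  # blue
print(first_non_nominal(checks, {"threshold1": 0, "threshold2": 0}))  # red
print(first_non_nominal(checks, {"threshold1": 1, "threshold2": 2}))  # green
```

Because the checks only share a priority order, each one could read from a different datasource.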

madansu commented 4 months ago

The more I think about it, the more the phrase "composite metrics" seems incorrect. What would be awesome to have is a way to amalgamate metric colors, given the 3 disparate metrics above.

As @xkilian pointed out, a simple yet effective technique would be to allow multiple metrics per cell, but bubble up the metric that is not "normal" (level 0).

For example:

box1_cpu -> thresholds of green: 0, orange: 60, red: 80
box1_icmp -> thresholds of green: 0, orange: 50, red: 75
box1_httpstat -> thresholds of green: 200, orange: 403, red: 500

Maybe always display the cell as an average of the 3 colors? That way, whatever the resulting color, the box would change color if any metric being measured rises above normal (green).

The method proposed by @xkilian would work as well.
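As a rough sketch of the "bubble up" idea, the following Python uses the three threshold lists from this comment and reports the metric sitting at the highest level. The function names, the sample values and the choice to treat values below the first bound as level 0 are assumptions made for illustration.

```python
from bisect import bisect_right

# Thresholds as listed in the comment: (lower bound, color), ascending.
THRESHOLDS = {
    "box1_cpu": [(0, "green"), (60, "orange"), (80, "red")],
    "box1_icmp": [(0, "green"), (50, "orange"), (75, "red")],
    "box1_httpstat": [(200, "green"), (403, "orange"), (500, "red")],
}


def level_of(metric, value):
    """Index of the highest threshold whose lower bound is <= value (0 = normal).
    Values below the first bound are treated as level 0 (an assumption)."""
    bounds = [bound for bound, _ in THRESHOLDS[metric]]
    return max(bisect_right(bounds, value) - 1, 0)


def bubble_up(values):
    """Return (metric, color) for the metric at the highest level.
    On a tie, the first metric in `values` wins."""
    worst = max(values, key=lambda metric: level_of(metric, values[metric]))
    return worst, THRESHOLDS[worst][level_of(worst, values[worst])][1]


# Hypothetical sample readings.
sample = {"box1_cpu": 50, "box1_icmp": 10, "box1_httpstat": 404}
print(bubble_up(sample))  # ('box1_httpstat', 'orange')
```

This treats level 1 of one metric as equivalent to level 1 of another, which is the comparability question raised earlier in the thread.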

bijwaard commented 4 months ago

You could use a grafana expression to create a metric that does the composition for you. This new metric could then be used to drive the color of your box / status indicator.

andymchugh commented 3 months ago

Support for this has landed and will be in the next build. Whereas 1.13.0 allows you to define a single color (dataRef + thresholds), the new version allows you to define an array of colors. Thresholds across sets are compared using a new term on the threshold called 'order' which, if not defined, defaults to the threshold array index. When comparing these 'order' terms you can choose to take the 'max' or the 'min'.

'avg' (i.e. a blend of colors) hasn't been added because at this point in the code we are dealing with strings, not RGB values.
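To make the 'order' comparison concrete, here is a small Python model of the behavior as described in this comment. It is not the panel's implementation or its yaml schema: only `dataRef`, `thresholds` and `order` come from the comment, while the `level` and `color` keys, the function names and the sample data are assumptions. Thresholds are assumed to be sorted by ascending level.

```python
def active_threshold(thresholds, value):
    """Return (order, color) of the highest threshold whose level is <= value,
    or None if the value is below every level. 'order' falls back to the
    threshold's array index when it is not given."""
    chosen = None
    for index, threshold in enumerate(thresholds):
        if value >= threshold["level"]:
            chosen = (threshold.get("order", index), threshold["color"])
    return chosen


def compound_color(color_sets, values, function="max"):
    """Pick one color across several (dataRef, thresholds) sets by comparing
    the 'order' of each set's active threshold with max or min."""
    active = []
    for color_set in color_sets:
        hit = active_threshold(color_set["thresholds"], values[color_set["dataRef"]])
        if hit is not None:
            active.append(hit)
    if not active:
        return None
    pick = max if function == "max" else min
    return pick(active, key=lambda pair: pair[0])[1]


color_sets = [
    {"dataRef": "box1_cpu", "thresholds": [
        {"level": 0, "color": "green"},    # order defaults to index 0
        {"level": 60, "color": "orange"},  # order defaults to index 1
        {"level": 80, "color": "red"},     # order defaults to index 2
    ]},
    {"dataRef": "box1_httpstat", "thresholds": [
        {"level": 200, "color": "green", "order": 0},
        {"level": 403, "color": "orange", "order": 5},  # explicit order outranks cpu 'red'
        {"level": 500, "color": "red", "order": 6},
    ]},
]

values = {"box1_cpu": 85, "box1_httpstat": 404}  # hypothetical readings
print(compound_color(color_sets, values, "max"))  # orange
print(compound_color(color_sets, values, "min"))  # red
```

In this model an explicit 'order' lets one metric's middle threshold outrank another metric's top threshold, which the default index-based ordering cannot express.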

The new yaml fields are labelColorCompound, strokeColorCompound, fillColorCompound. There's example yaml in provisioning.

Say if you think this meets what you are after...

madansu commented 3 months ago

Thank you very much for the super quick turnaround on this. I am currently away on vacation, and will try this once I am back at work, in the first week of July.

andymchugh commented 3 months ago

published in 1.14.0

xkilian commented 4 weeks ago

Hello Andrew, I had tested this feature a while back. Unfortunately, my grafana test instance suffered data loss. Now, for the life of me, I cannot understand how to use it. Would it be possible to create an example of how to use fillColorCompound with multiple metrics? @madansu (Do you have an example to share of your use case?)

andymchugh commented 4 weeks ago

Sure, all the features are demonstrated in the provisioning dashboards, with the files living in provisioning/dashboardData. There's a link to the example in the June 9 comment: https://github.com/andymchugh/andrewbmchugh-flow-panel/blob/main/provisioning/dashboardData/compoundColor.yaml

xkilian commented 4 weeks ago

Thank you. If I am not mistaken, there is no link to it on the getting started page. I missed it when browsing the examples directory. Keep up the great work.