infrawatch / telemetry-framework

Telemetry Framework contains the installation tools for delivery of the Service Assurance Framework [Tech Preview]
Apache License 2.0

[Dashboarding] Zero values showing up in df timeseries in prometheus #113

Open pleimer opened 4 years ago

pleimer commented 4 years ago

Putting this here before I forget this issue exists.

Certain dashboarding components use division in their queries to calculate percentages. If the divisor is zero, the result shows up as 100% (or at least that is how Grafana renders the division error). The root of the problem, then, is that the time series contains a zero value in the first place.
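To make the division problem concrete, here is a minimal Python sketch, not the actual dashboard queries. It assumes IEEE 754 float semantics, which is what Prometheus uses for division (x/0 gives ±Inf and 0/0 gives NaN). The function names `ieee_divide` and `percent_used` are hypothetical and exist only for illustration.

```python
import math


def ieee_divide(numerator: float, divisor: float) -> float:
    """Divide like IEEE 754 floats do, instead of raising as Python does."""
    if divisor == 0:
        if numerator == 0 or math.isnan(numerator):
            return math.nan  # 0/0 is undefined
        return math.copysign(math.inf, numerator)  # x/0 overflows to +/-Inf
    return numerator / divisor


def percent_used(used: float, total: float):
    """Return used/total as a percentage, or None when the ratio is undefined."""
    ratio = ieee_divide(used, total)
    if math.isnan(ratio) or math.isinf(ratio):
        return None  # leave a gap rather than plot a misleading value
    return 100.0 * ratio


print(percent_used(5.0, 20.0))  # 25.0
print(percent_used(15e9, 0.0))  # None
```

Returning `None` for the undefined case is a design choice in this sketch: a gap in the graph is easier to spot as bad data than a value pinned at 100%.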

I see this when graphing some of the metrics from the df collectd plugin. For example, here is an erratic zero causing the very graphing issue described above, where disk usage reported by the collectd df plugin drops from 15GB to 0 in a single instant and then back up:

image

On a wider scale, it is easier to see why this is problematic:

image

This may cause alarms to be triggered and cause confusion for the end user.
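The single-sample drop described above (15GB to 0 and back) could be flagged with something like the following Python sketch. It assumes the samples have already been pulled into a plain list of floats; `isolated_zeros` is a hypothetical helper, not part of collectd, Prometheus, or Grafana.

```python
def isolated_zeros(values):
    """Return indices of zero samples whose neighbours on both sides are non-zero."""
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i] == 0 and values[i - 1] != 0 and values[i + 1] != 0
    ]


# Disk usage in bytes with one erratic zero in the middle.
samples = [15e9, 15e9, 0.0, 15e9, 15e9]
print(isolated_zeros(samples))  # [2]
```

This only catches a zero that lasts exactly one sample; a longer run of zeros is left alone, since it may be a genuine reading.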

pleimer commented 4 years ago

So far I only see this for values provided by the df plugin.

csibbitt commented 4 years ago

Check whether you are really getting 0s in the source data from Prometheus, or whether Grafana is simply supplying them in response to a missing sample.

image
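One way to run that check outside Grafana is to query the Prometheus HTTP API directly and count literal zero samples. The Python sketch below uses the `/api/v1/query_range` endpoint. The base URL and the metric name in the usage comment are assumptions for illustration; substitute whatever the df panel actually queries.

```python
import json
import urllib.parse
import urllib.request


def count_zero_samples(payload):
    """Count literal zero samples per series in a query_range response."""
    counts = {}
    for series in payload["data"]["result"]:
        label = json.dumps(series["metric"], sort_keys=True)
        counts[label] = sum(
            1 for _ts, value in series["values"] if float(value) == 0.0
        )
    return counts


def fetch_range(base_url, query, start, end, step):
    """Fetch a range query result from Prometheus as a parsed JSON dict."""
    params = urllib.parse.urlencode(
        {"query": query, "start": start, "end": end, "step": step}
    )
    url = f"{base_url}/api/v1/query_range?{params}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


# Hypothetical usage; the URL and metric name are placeholders:
# payload = fetch_range("http://localhost:9090", "collectd_df_df_complex",
#                       1600000000, 1600003600, "15s")
# print(count_zero_samples(payload))
```

Prometheus does not substitute zeros for missing samples, so any non-zero count here suggests the zeros were stored in the time series rather than filled in by Grafana.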

pleimer commented 4 years ago

Thanks for pointing that out. Unfortunately, trying each of the options made no difference.