OpenTSDB / opentsdb

A scalable, distributed Time Series Database.
http://opentsdb.net
GNU Lesser General Public License v2.1

Problem with Downsampling #1295

Open markalavin opened 6 years ago

markalavin commented 6 years ago

I have the following query:

{"start":"1533189600000","end":"1533222648727","timezone":null,"options":null,"padding":false,"queries":[{"aggregator":"none","metric":"active_pwr.386905.na.live.eda","tsuids":null,"downsample":"0all-sum","filters":[{"tagk":"assetId","filter":"367446","group_by":false,"type":"literal_or"}],"rate":false,"rateOptions":null,"explicitTags":false}],"delete":false,"msResolution":true,"noAnnotations":false,"globalAnnotations":false,"showTSUIDs":false,"showQuery":false,"showStats":false,"showSummary":false,"useCalendar":false} 

Doing the math, the start-to-end interval is about 9.18 hours. With 0all-sum, I expect the query above to sum the values of each metric:tag time series (there is only one) over that 9.18-hour span. However, if I change the downsampler to 9h-sum, expecting roughly the same value, I get a different result, and a different one again with 33048s-sum. The results are:

0all-sum    860837
9h-sum      421301
33048s-sum  624229

I notice the same kind of behavior when I change the downsample operation from sum to avg.

Is this the intended behavior? It sure doesn't seem to me that it should work this way.

I tried an experiment: I output the values from the time series without any downsampling, then summed and counted them using awk. The result (sum / count) agrees exactly with the "0all-avg" downsampled version.
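The same cross-check can be sketched in Python instead of awk. This is only an illustration: the metric name and the three data points below are made up, and the only thing assumed about OpenTSDB is that `/api/query` returns a list of series, each carrying a `dps` map of timestamp to value. `sum_and_count` is a hypothetical helper.

```python
import json

# Hypothetical response body in the shape returned by /api/query:
# a list of series, each with a "dps" map of timestamp -> value.
# The metric name and values are invented for illustration.
raw = json.loads("""
[{"metric": "example.metric", "tags": {}, "aggregateTags": [],
  "dps": {"1533189600000": 10.0,
          "1533189660000": 20.0,
          "1533189720000": 60.0}}]
""")


def sum_and_count(series):
    """Return (sum, count) over the raw data points of one series."""
    values = list(series["dps"].values())
    return sum(values), len(values)


total, count = sum_and_count(raw[0])
# total / count is the figure to compare against a 0all-avg query.
print(total, count, total / count)
```

Running the non-downsampled response through a check like this gives the reference sum and average to compare against each downsampled variant.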

burk commented 6 years ago

I think your downsampling interval does not necessarily align with the start of your query interval (since 24h is not a multiple of 9h), see: http://opentsdb.net/docs/build/html/user_guide/query/downsampling.html
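To make the alignment point concrete, here is a rough sketch. It assumes bucket boundaries fall on epoch multiples of the interval (timestamp minus timestamp modulo interval), which is my reading of the downsampling page linked above; `aligned_buckets` is a hypothetical helper, not OpenTSDB code. The start and end values are the ones from the query in this issue.

```python
START_MS = 1533189600000  # "start" from the query in this issue
END_MS = 1533222648727    # "end" from the query in this issue


def aligned_buckets(start_ms, end_ms, interval_ms):
    """Return (bucket_start_ms, covered_ms) for each bucket touching the range.

    Assumption: buckets start at epoch multiples of interval_ms,
    not at the start of the query range.
    """
    first = start_ms - (start_ms % interval_ms)
    buckets = []
    for bucket_start in range(first, end_ms, interval_ms):
        lo = max(bucket_start, start_ms)
        hi = min(bucket_start + interval_ms, end_ms)
        buckets.append((bucket_start, hi - lo))
    return buckets


for label, interval_ms in (("9h", 9 * 3600 * 1000), ("33048s", 33048 * 1000)):
    for bucket_start, covered_ms in aligned_buckets(START_MS, END_MS, interval_ms):
        # covered_ms is how much of the query range lands in this bucket.
        print(label, bucket_start, round(covered_ms / 3_600_000, 2), "h")
```

Under that assumption, neither interval yields a single bucket spanning the whole 9.18-hour range: each splits it into two partial buckets, so each query would return two data points with different timestamps, and no single one of them is the full-range sum.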

gnydick commented 5 years ago

I'm getting extremely different values

metric named: widget_size (a gauge for every widget processed)
no filters or tags
agg: sum
ds-agg: sum

"last 7 days"

ds-disabled: total=35.333466 TiB
ds=1h: total=579.75 GiB
ds=6h: total=220.48 GiB
ds=12h: total=149.60 GiB
ds=24h: total=102.21 GiB
ds=3.5d: total=75.0 GiB

ds=0all: total=105.3 GiB

I've gone through the data by hand and calculated it all outside the database, and the 35.33 TiB is correct. I've checked all of the tags, checked that there are no duplicated datapoints, etc. I really spent a long time digging into this.

Shouldn't agg:sum, ds-agg:sum always result in the same sums no matter what the ds period is, within reason?
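For a single series, that expectation does hold arithmetically, provided every returned bucket is added up. The sketch below checks it on synthetic data with a hypothetical `downsample_sum` helper; it is not OpenTSDB's implementation and does not model aggregation across multiple series or interpolation, which may behave differently.

```python
from collections import defaultdict


def downsample_sum(points, interval_s):
    """Sum (timestamp, value) points into buckets aligned on interval_s."""
    buckets = defaultdict(int)
    for ts, value in points:
        buckets[ts - ts % interval_s] += value
    return dict(buckets)


# Synthetic series: one integer point per minute for 7 days.
points = [(t, (t // 60) % 7 + 1) for t in range(0, 7 * 86400, 60)]
raw_total = sum(value for _, value in points)

# Total of all bucket sums for several downsample intervals (in seconds).
totals = {
    interval_s: sum(downsample_sum(points, interval_s).values())
    for interval_s in (3600, 21600, 43200, 86400)
}

for interval_s, total in totals.items():
    print(interval_s, total, total == raw_total)
```

Every interval reproduces the raw total here, so a discrepancy of this size suggests that something other than the bucket width differs between the queries.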

manolama commented 5 years ago

@markalavin @burk is right in that the downsampler will attempt to align on a "useful" offset from the start of the day, in your case, 9 hours after midnight. It then sums everything in that 9-hour bucket. You should see different timestamps depending on the query.

manolama commented 5 years ago

@gnydick can you paste the full query? 0all should have given you the 35 TiB, but the others depend on the query range.