oetiker / rrdtool-2.x

RRDtool 2.x - The Time Series Database

Respect variable month length in data aggregation #16

Open lafar6502 opened 11 years ago

lafar6502 commented 11 years ago

Currently rrdtool uses a fixed-size aggregation period, which makes it impossible to calculate monthly statistics. It would be great if there were an option to specify monthly aggregation (from the 1st to the last day of the month).

oetiker commented 11 years ago

about a year ago, I proposed the following

in a customer project the 'localtime' question has come up regarding graphs and consolidation.

rrdtool internally works in GMT, so a day in rrdtool is always GMT aligned. If you define RRAs for 1 day intervals, this is what you will get. With the world growing smaller, I still think this is a good thing and I do not want to change it.

BUT for presentation it looks rather odd, when daily averages are not aligned to the local 'idea' of a day.

So if you store hourly data in your RRAs, it would be nice if rrdtool could build daily averages for 'local' days on the fly.

My idea is to provide some additional parameters to the DEF function in rrd graph to allow it to 'massage' the incoming data into daily, weekly, monthly 'portions'. To make things as flexible as possible, I think of employing the strftime function as a trigger mechanism, like this:

DEF:weekly=test.rrd:ifHCInOctets:AVERAGE:step=3600:ctrigger=%V:tz=CET

(%V would be the ISO week number, but you could use any expression you want).

Whenever the value of the ctrigger changes, one 'grouping' comes to an end. If the original data is not available at 'step' resolution from rrd fetch, some interpolation might artificially enhance the resolution prior to blocking it again.
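To make the trigger idea concrete, here is a minimal Python sketch of the grouping step. It is not rrdtool code: the function name `group_by_trigger`, the sample layout and the fixed UTC+1 offset standing in for tz=CET (daylight saving is ignored) are all assumptions made for illustration. A group ends whenever the strftime output for a sample's local time changes.

```python
from datetime import datetime, timedelta, timezone
from itertools import groupby


def group_by_trigger(samples, trigger, tz):
    """Average (epoch_seconds, value) samples over runs in which
    strftime(trigger), evaluated in timezone tz, stays constant.

    Hypothetical helper, not part of rrdtool.
    Returns a list of (label, first_ts, last_ts, average).
    """
    def key(sample):
        return datetime.fromtimestamp(sample[0], tz).strftime(trigger)

    groups = []
    for label, run in groupby(samples, key=key):
        run = list(run)
        # Skip unknown values, similar in spirit to rrdtool's UNKNOWN.
        values = [v for _, v in run if v is not None]
        avg = sum(values) / len(values) if values else None
        groups.append((label, run[0][0], run[-1][0], avg))
    return groups


# Assumption: a fixed UTC+1 offset stands in for tz=CET (no DST handling).
CET = timezone(timedelta(hours=1))

# 48 hourly samples starting at 2024-01-31 00:00 UTC; value = hour index.
start = int(datetime(2024, 1, 31, tzinfo=timezone.utc).timestamp())
samples = [(start + 3600 * i, float(i)) for i in range(48)]

monthly_local = group_by_trigger(samples, "%Y-%m", CET)
monthly_gmt = group_by_trigger(samples, "%Y-%m", timezone.utc)
```

With these samples the month boundary falls one hour earlier in the UTC+1 grouping than in the GMT grouping, so the two calls put a different number of samples into January and produce different averages.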

I have not implemented anything yet ... so please discuss

jfesler commented 11 years ago

So if you store hourly data in your RRAs, it would be nice if rrdtool could build daily averages for 'local' days on the fly.

I know my $dayjob would love to see this feature, as they operate products around the world, and all stats for "Days" are GMT. Their concept of peak for a day gets a bit wonky depending on the market.

lafar6502 commented 11 years ago

Tobi, from what I understood, I think I like your idea - this should be pretty powerful. There is, however, one thing related to another feature request of mine: I wanted rrdtool to provide the current graph resolution so it could be used in RPN calculations. With variable aggregation periods the resolution will no longer be constant for the whole graph, so this needs some rethinking. I need the resolution to be able to calculate sums for each aggregation period, and I do it as you advised - by multiplying the average value calculated by RRDtool by the graph time resolution. So, maybe without playing with a variable resolution, we could just add a SUM consolidation function to rrdtool and make it cooperate nicely with the new aggregation mechanism?
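The sum-from-average arithmetic can be sketched in a few lines of Python. This is only an illustration of why a constant resolution stops working for monthly periods; `monthly_total` is a hypothetical helper, the average is assumed to be a per-second rate, and every day is assumed to last 86400 seconds (no DST shifts).

```python
import calendar


def monthly_total(year, month, average_rate):
    """Total for one calendar month = average rate (units per second)
    multiplied by the number of seconds in that month.

    Hypothetical helper; assumes 86400-second days.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    return average_rate * days_in_month * 86400


# The same average rate yields different totals, because the period
# length differs from month to month.
total_jan = monthly_total(2024, 1, 2.0)  # 31 days
total_feb = monthly_total(2024, 2, 2.0)  # 29 days (leap year)
```

The multiplier differs for each period, which is why a single graph-wide resolution value would not be enough and a per-period sum would have to know the length of its own period.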

oetiker commented 11 years ago

can you describe your use case in a bit more detail? because if you want the average for a single local day, you simply have to request a graph for that time range ... this works fine already ...

what I am trying to do here is this: if you create a graph where the line on the graph is a staircase, stepping at midnight ... presently the step will happen at GMT midnight; afterwards, the step would occur at local midnight ...
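As a small illustration of the difference between the two step positions, the Python sketch below computes the start of the day containing a timestamp, once in GMT and once in a local zone. The helper `day_start` is hypothetical, and a fixed UTC+1 offset is assumed in place of a real local timezone (no DST handling).

```python
from datetime import datetime, timedelta, timezone


def day_start(ts, tz):
    """Epoch seconds of the midnight (in timezone tz) that begins the
    day containing ts. Hypothetical helper for illustration only."""
    local = datetime.fromtimestamp(ts, tz)
    midnight = local.replace(hour=0, minute=0, second=0, microsecond=0)
    return int(midnight.timestamp())


# Assumption: fixed UTC+1 stands in for the local timezone.
LOCAL = timezone(timedelta(hours=1))

# Noon GMT on an arbitrary day.
ts = int(datetime(2024, 3, 10, 12, 0, tzinfo=timezone.utc).timestamp())

gmt_step = day_start(ts, timezone.utc)  # where the staircase steps today
local_step = day_start(ts, LOCAL)       # where it would step instead
shift_seconds = gmt_step - local_step
```

Here the local step lands one hour before the GMT step, which is the shift the staircase would show on the graph.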