oetiker / rrdtool-2.x

RRDtool 2.x - The Time Series Database

Sane time interval defaults for "fetch" #37

Open xtaran opened 9 years ago

xtaran commented 9 years ago

rrdtool fetch, including at least its Perl interfaces RRDs and RRDTool::OO, by default only fetches data from the past 24 hours, regardless of whether the RRD actually contains any data from the past 24 hours.

From my point of view, sane defaults would be

If fetch should be backward-compatible, i.e. changing these defaults is out of scope, I strongly recommend adding a new rrdtool subcommand which does the same, but with saner default values for time intervals.

oetiker commented 9 years ago

Does that mean that if I set up an RRA with 1s interval data for the last 5 years, and then do a fetch without specifying the time range I am interested in, you would like rrdtool to give me all 5*365*24*60*60 data values?

xtaran commented 9 years ago

Yes. If I have such a huge RRA, I would specify the interval I'm interested in from the very beginning -- or actually would like to have all 158 million datasets.
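As a quick check of the figure quoted in this exchange, the expression from the question can be evaluated directly. This is only a sketch of the arithmetic; it ignores leap days, exactly as the original expression does, and the variable names are chosen here for illustration.

```python
# Values a 1-second RRA would hold over 5 years, using the same
# factors as the expression in the question (no leap days).
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
total_values = 5 * SECONDS_PER_YEAR

print(total_values)  # 157680000, i.e. roughly 158 million
```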

oetiker commented 9 years ago

Hmm, could it be that you are actually looking for rrdtool dump? The idea of fetch is that it will give you the data best matching your request. And if you don't specify a time range, there are defaults.

xtaran commented 9 years ago

Not really. I wanted to iterate over the entire contents of a week-old .rrd file and extract them entry by entry. I totally didn't expect that fetch would by default only give me data for timestamps which are not even stored in the .rrd file. XML is not an appropriate intermediate format for that IMHO.

xtaran commented 9 years ago

With the example given above, I do see that "all the values from the last 24 hours of those hours stored in the file" may be a sane default, too. But using a time interval which is not even present in the RRA doesn't make much sense to me.

I do see that when plotting graphs for monitoring applications, you usually want recent data. But IMHO this is logic which should be implemented in the graph plotting part, not in the data fetching part.
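The alternative default floated in this comment ("the last 24 hours of those hours stored in the file") can be sketched as a small helper. This is a hypothetical illustration, not rrdtool's behaviour: the function name `default_window` and its parameters are made up here, and `first` and `last` are assumed to be epoch-second timestamps of the oldest and newest stored data, for example as reported by the `rrdtool first` and `rrdtool last` subcommands.

```python
DAY = 24 * 60 * 60  # seconds in 24 hours


def default_window(first, last, span=DAY):
    """Return (start, end) for a fetch that ends at the newest stored
    timestamp and reaches back at most `span` seconds, but never
    further than the oldest stored timestamp.

    Hypothetical helper: `first` and `last` are assumed to be epoch
    seconds taken from the file itself, not from the wall clock.
    """
    if last < first:
        raise ValueError("last must not be earlier than first")
    start = max(first, last - span)
    return start, last


# A file holding one week of data: the window is its final 24 hours.
print(default_window(0, 7 * DAY))       # (518400, 604800)
# A file holding only one hour of data: the window is the whole file.
print(default_window(1000, 1000 + 3600))  # (1000, 4600)
```

The point of the sketch is that the window is anchored to what the file contains rather than to the current time, so a week-old file would still yield data.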

xtaran commented 9 years ago

Neither RRDs::dump nor RRDTool::OO->dump() really helps much, as they seem to print the dump to STDOUT instead of returning it as a string or hashref. So no, it's not really an alternative for iterating over the values, even if I tried parsing the XML (which has the timestamps only in comments, i.e. they're ignored/dropped/lost with most XML parsers).

Maybe we should discuss this topic in real life at one of the next Swiss Perl Community Meetups or at the Swiss Perl Workshop. That's probably easier. :-)

oetiker commented 9 years ago

hihi sure :)