Val suggested two improvements to decrease computing times and storage requirements:
Lossless compression of the stage time series: if three consecutive values are aligned, remove the middle one. Val tried this on a 25-year, 10-min stage series and went from 870 309 down to 256 302 points - with no loss of information!
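A minimal sketch of this idea, assuming the series is stored as parallel time/value lists (function and argument names are illustrative, not BaRatinAGE's actual API). A point is dropped when it lies exactly on the line between the last kept point and the next point, so the original series can be rebuilt losslessly by linear interpolation:

```python
def compress_aligned(times, values, tol=0.0):
    """Drop points that lie on the straight line between their neighbours.

    With tol=0 the compression is lossless for a series reconstructed
    by linear interpolation; a small tol would make it lossy but stronger.
    """
    if len(times) < 3:
        return list(times), list(values)
    kept_t, kept_v = [times[0]], [values[0]]
    for i in range(1, len(times) - 1):
        t0, v0 = kept_t[-1], kept_v[-1]          # last kept point
        t1, v1 = times[i], values[i]             # candidate point
        t2, v2 = times[i + 1], values[i + 1]     # next point
        # value predicted at t1 by interpolating between (t0,v0) and (t2,v2)
        v_interp = v0 + (v2 - v0) * (t1 - t0) / (t2 - t0)
        if abs(v1 - v_interp) > tol:
            kept_t.append(t1)
            kept_v.append(v1)
    kept_t.append(times[-1])
    kept_v.append(values[-1])
    return kept_t, kept_v
```

On a perfectly linear segment only the two endpoints survive, which is why long flat or steadily rising reaches of a 10-min stage record compress so well.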
Transformation into discharge by value rather than by time step. Because of rounding, the stage time series above, containing 870 309 time steps, contains only 4004 distinct stage values! It would therefore be much more efficient to transform only these 4004 values, and then put them back in the correct order to rebuild the time series. It's actually a bit more complicated than that because of stage errors, but it's certainly feasible to round the noisy stage values to the same resolution as the original stage data.
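The core of this idea can be sketched as follows (names are illustrative; `rating_curve` stands for whatever stage-to-discharge transformation is applied, which in practice involves the full posterior sample rather than a single function). Each distinct stage value is transformed once, then a lookup restores the original order:

```python
def transform_by_value(stages, rating_curve):
    """Apply rating_curve once per distinct stage value, then map back.

    For a series with many repeated (rounded) stage values, this replaces
    len(stages) curve evaluations with one per distinct value.
    """
    distinct = set(stages)                        # e.g. 4004 values out of 870 309 steps
    lookup = {h: rating_curve(h) for h in distinct}
    return [lookup[h] for h in stages]
```

The speed-up is roughly the ratio of time steps to distinct values, so around 200x for the series quoted above, provided the noisy stage values are first rounded back to the original resolution so they actually collide in the lookup table.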
More generally, this raises the question of offering pre-processing tools for input datasets, in relation to issue #26. Subsampling tools would be particularly valuable - but is that BaRatinAGE's job, or should it rather be done by an external app?