Open davidcsterratt opened 8 months ago
Thank you for raising this and for the reproducible code.
The implementation of rolling min and max did not follow any paper; it was simply my idea of how to extend the online algorithm from rolling mean to support min/max. It turned out to be good enough, so there was no need to look for better algorithms. So I believe there is room for improvement here.
The good news is how narrow the problematic data cases are: only descending sequences (for max; ascending for min). To avoid reimplementing the algorithm, we could even reverse the input, swap the `align` argument, and then reverse the results. Of course, a proper algorithm would be better.
The manual (possibly in the rollmedian branch and not in `frollmax`) already explains the edge case where the naive approach can be faster than this one. We could possibly add an example of how to work around it.
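The reverse-and-swap idea can be checked with a small base R sketch. Everything here is illustrative: `roll_max_naive()` and `roll_max_reversed()` are hypothetical helpers written for this example, not data.table functions, and only "left" and "right" alignment are covered.

```r
# Naive rolling max, used only as a reference implementation for this sketch.
# align = "right": the window ends at position i.
# align = "left":  the window starts at position i.
roll_max_naive <- function(x, k, align = c("right", "left")) {
  align <- match.arg(align)
  n <- length(x)
  out <- rep(NA_real_, n)
  if (n < k) return(out)
  for (i in seq_len(n - k + 1L)) {
    w <- max(x[i:(i + k - 1L)])
    if (align == "right") out[i + k - 1L] <- w else out[i] <- w
  }
  out
}

# Hypothetical wrapper: reverse the input, swap the alignment,
# then reverse the result back into the original order.
roll_max_reversed <- function(x, k, align = c("right", "left")) {
  align <- match.arg(align)
  swapped <- if (align == "right") "left" else "right"
  rev(roll_max_naive(rev(x), k, align = swapped))
}

x <- c(9, 7, 8, 3, 5, 1)
same_right <- identical(roll_max_reversed(x, 3, "right"), roll_max_naive(x, 3, "right"))
same_left  <- identical(roll_max_reversed(x, 3, "left"),  roll_max_naive(x, 3, "left"))
```

The point of the trick is that a descending input becomes ascending once reversed, so the rolling function only ever sees the cheap case; the two `rev()` calls cost a single pass each.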
Does this only apply in the case of monotone decreasing sequences? It would also be interesting to see something like
y = -mx + eps, eps ~ N(0, s)
and compare performance as s -> 0.
If performance degrades gradually as s shrinks, that's much more worrisome than if it's only an issue exactly in the s = 0 edge case.
Whenever each next element is smaller than the previous one, it will be just as bad as `N:1`.
So from an algorithmic point of view, it will be slow when we have a lot of inversions?
It depends on where the inversions are relative to the window size and the location of the max within the window. A decreasing sequence is the extreme case.
The number of nested find-max calls is reported with `verbose`, at least in the rollmedian branch.
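To get a rough feel for the `y = -mx + eps` question without timing anything, one could count window rescans under a simplified model. This is only a sketch under a stated assumption: it assumes the online algorithm remembers the position of the current max and rescans the whole window only when that position drops out of the window. It is not data.table's actual C code, and `count_rescans()` is a hypothetical helper. `s` is taken to be the standard deviation of the noise.

```r
# Assumed model of the online rolling max (NOT the real implementation):
# keep the position of the current max; rescan the window only when
# that position has fallen out of the window.
count_rescans <- function(x, k) {
  n <- length(x)
  stopifnot(k >= 1L, n >= k)
  pos <- which.max(x[seq_len(k)])   # position of the max in the first window
  rescans <- 0L
  for (i in seq_len(n - k) + k) {
    if (x[i] >= x[pos]) {
      pos <- i                      # new element becomes the max: cheap update
    } else if (pos < i - k + 1L) {
      lo <- i - k + 1L              # old max left the window: full rescan
      pos <- lo - 1L + which.max(x[lo:i])
      rescans <- rescans + 1L
    }
  }
  rescans
}

# y = -m*x + eps, eps ~ N(0, s), for a few noise levels s.
set.seed(42)
n <- 1e4; k <- 100; m <- 1
for (s in c(0, 0.1, 1, 10, 100)) {
  x <- -m * seq_len(n) + rnorm(n, mean = 0, sd = s)
  cat(sprintf("s = %6.1f  rescans = %d\n", s, count_rescans(x, k)))
}
```

Under this model a strictly descending input triggers a rescan at every step after the first window, and a strictly ascending input triggers none, so the rescan count gives a noise-free proxy for how the cost changes with `s`.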
Glad that the report was helpful. I'm not an expert in these algorithms, but my main observation is that the rolling max/min is surprisingly tricky compared to the rolling mean case! (Before doing some reading online, I started trying to write my own, and got stuck.)
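For reference, one standard way to avoid the descending-sequence slowdown is a monotonic deque of candidate indices, which is the idea behind the "ascending minima" family of algorithms (here flipped for max). The following is a minimal base R sketch, right-aligned only, with no NA handling; `roll_max_deque()` is a hypothetical name and is not taken from data.table, `roll`, or `RCRoll`.

```r
# Rolling max via a monotonic deque of candidate indices (right-aligned).
# Values at the stored indices are strictly decreasing from front to back,
# so the front always points at the max of the current window.
roll_max_deque <- function(x, k) {
  n <- length(x)
  out <- rep(NA_real_, n)
  dq <- integer(n)   # storage for candidate indices
  head <- 1L         # front of the deque
  tail <- 0L         # back of the deque; the deque is empty when tail < head
  for (i in seq_len(n)) {
    # Drop candidates from the back that can never be the max again.
    while (tail >= head && x[dq[tail]] <= x[i]) tail <- tail - 1L
    tail <- tail + 1L
    dq[tail] <- i
    # Drop the front candidate once it has left the window.
    if (dq[head] <= i - k) head <- head + 1L
    if (i >= k) out[i] <- x[dq[head]]
  }
  out
}

res <- roll_max_deque(c(9, 7, 8, 3, 5, 1), 3)
# res is NA NA 9 8 8 5
```

Each index is pushed once and popped at most once, so the total work is linear in the input length regardless of ordering; on a descending input nothing is popped from the back and the front simply expires once per step.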
In terms of other test cases, it might be interesting to have some pink noise, i.e. like the rnorm case, but smoothed by a filter. It might also be interesting to check the scaling with the sequence length and the window length.
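A sketch of how such test inputs and a scaling grid might be set up in base R. Assumptions are flagged in the comments: the "smoothed" noise here is an AR(1) low-pass filter via `stats::filter()`, which is only a stand-in and not true 1/f pink noise, and `smooth_noise()`, `time_grid()` and `naive_max()` are hypothetical helpers; any rolling function with the signature `f(x, k)` could be passed in place of `naive_max()`.

```r
# Smoothed Gaussian noise: white noise passed through a recursive AR(1)
# filter. This is an assumed stand-in for "pink" noise, not a true 1/f process.
smooth_noise <- function(n, phi = 0.9) {
  e <- rnorm(n)
  as.numeric(stats::filter(e, filter = phi, method = "recursive"))
}

# Time a rolling function over a grid of sequence lengths and window lengths.
time_grid <- function(roll_fun, ns, ks, gen = smooth_noise) {
  grid <- expand.grid(n = ns, k = ks)
  grid$elapsed <- mapply(function(n, k) {
    x <- gen(n)
    system.time(roll_fun(x, k))[["elapsed"]]
  }, grid$n, grid$k)
  grid
}

# Placeholder rolling max so the sketch is self-contained.
naive_max <- function(x, k) {
  n <- length(x)
  c(rep(NA_real_, k - 1L),
    vapply(k:n, function(i) max(x[(i - k + 1L):i]), numeric(1)))
}

set.seed(1)
res <- time_grid(naive_max, ns = c(1e3, 1e4), ks = c(10, 100))
```

Swapping `gen` for a descending or ascending generator would put the monotone cases on the same grid for comparison.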
Anyway, all the best with finding a good solution.
The algorithm used for the median could probably be adapted for min/max as well, but I don't think it is worth the effort.
# Minimal reproducible example; please be sure to set `verbose=TRUE` where possible!

Output of the benchmarking code:

`data.table::frollmax()` is considerably slower for descending sequences.

Note that `RCRoll` is an unpublished package I wrote based on the late Richard Harter's ascending minimum algorithm before I realised that the `roll` package seemed to be about as efficient (at least on one core). Note also that I've not compared the results and I've used the rolling min function from `roll`.

# Output of sessionInfo()
P.S. @jangorecki This issue was prompted by your nice talk at EdinbR on Friday - I was the person who said I'd follow up after the meeting.