Rdatatable / data.table

R's data.table package extends data.frame:
http://r-datatable.com
Mozilla Public License 2.0
3.59k stars 978 forks source link

frollmax is slow on descending sequences #5923

Open davidcsterratt opened 8 months ago

davidcsterratt commented 8 months ago

# Minimal reproducible example; please be sure to set verbose=TRUE where possible!

if (!require("devtools")) install.packages("devtools")
if (!require("RCRoll")) devtools::install_github("davidcsterratt/RCRoll")
if (!require("roll")) install.packages("roll")
if (!require("data.table")) devtools::install_github("Rdatatable/data.table", branch="frollmax")

benchmark <- function(n, k, func) {
  x = func(n)
  start_time = Sys.time(); y.RCRoll = RCRoll::rollmin(x, k) ; print(paste("RCRoll:", Sys.time() - start_time))
  start_time = Sys.time(); y.roll = roll::roll_min(x, k) ; print(paste("roll:", Sys.time() - start_time))
  start_time = Sys.time(); y.data.table = data.table::frollmax(x, k) ; print(paste("data.table:", Sys.time() - start_time))
   return(list("RCRoll"=y.RCRoll, "roll"=y.roll, "data.table"=y.data.table))
}

n <- 1E7
k <- 1E3

# Random normal
out1 <- benchmark(n, k, rnorm)
# All equal to 0
out2 <- benchmark(n, k, function(n) {rep(0, n)})
# Ascending
out3 <- benchmark(n, k, function(n) {seq(0, length.out=n)})
# Descending
out4 <- benchmark(n, k, function(n) {seq(0, by=-1, length.out=n)})

Output of the benchmarking code:

[1] "RCRoll: 0.169624805450439"
[1] "roll: 0.153290987014771"
[1] "data.table: 0.0348567962646484"
[1] "RCRoll: 0.0276196002960205"
[1] "roll: 0.0650320053100586"
[1] "data.table: 0.034125804901123"
[1] "RCRoll: 0.0440917015075684"
[1] "roll: 0.0651366710662842"
[1] "data.table: 0.0270349979400635"
[1] "RCRoll: 0.0287351608276367"
[1] "roll: 0.0672793388366699"
[1] "data.table: 3.9090781211853"

data.table::frollmax() is considerably slower for descending sequences.

Note that RCRoll is an unpublished package I wrote based on the late Richard Harter's ascending minimum algorithm before I realised that the roll package seemed to be about as efficient (at least on one core). Note also that I've not compared the results and I've used the rolling min function from roll.

# Output of sessionInfo()

> sessionInfo()
R Under development (unstable) (2023-12-29 r85751)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.3 LTS

Matrix products: default
BLAS:   /usr/local/lib/R/lib/libRblas.so 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/London
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.14.99 RcppRoll_0.3.0     zoo_1.8-12         roll_1.1.6        
[5] RCRoll_0.1         devtools_2.4.5     usethis_2.2.2     

loaded via a namespace (and not attached):
 [1] miniUI_0.1.1.1     compiler_4.4.0     promises_1.2.1     Rcpp_1.0.12       
 [5] stringr_1.5.1      later_1.3.2        fastmap_1.1.1      lattice_0.20-45   
 [9] mime_0.12          R6_2.5.1           htmlwidgets_1.6.4  profvis_0.3.8     
[13] shiny_1.8.0        rlang_1.1.3        cachem_1.0.8       stringi_1.8.3     
[17] httpuv_1.6.13      fs_1.6.3           RcppParallel_5.1.7 pkgload_1.3.3     
[21] memoise_2.0.1      cli_3.6.2          magrittr_2.0.3     grid_4.4.0        
[25] digest_0.6.34      xtable_1.8-4       remotes_2.4.2.1    lifecycle_1.0.4   
[29] vctrs_0.6.5        glue_1.7.0         urlchecker_1.0.1   sessioninfo_1.2.2 
[33] pkgbuild_1.4.3     purrr_1.0.2        tools_4.4.0        ellipsis_0.3.2    
[37] htmltools_0.5.7   

P.S. @jangorecki This issue was prompted by your nice talk at EdinbR on Friday - I was the person who said I'd follow up after the meeting.

jangorecki commented 8 months ago

Thank you for raising that and reproducible code.

Implementation of rolling min and max was not following any paper, it was simply my idea how to extend online algo from rolling mean to make it support min/max. It turned to be good enough so there was no need to look for better algos. Therefore there is a space for improvement here I believe.

What is good is how narrow are data cases which ends up to have problems: only desc sequence (for max, and asc for min). To not reimplement algo, we could even reverse input and swap align arg, and then reverse results. Of course better to have proper algo.

Manual (possibly in rollmedian branch and not frollmax) already explains edge case where naive approach can be faster than this one. We could possibly add example how to work around of this.

MichaelChirico commented 8 months ago

Does this only apply in the case of monotone decreasing sequences? it would also be interesting to see something like

y=-mx+eps, eps~N(0, s)

and compare performance as s->0.

if performance degrades more and more, that's much more worrisome than if it's only an issue exactly in the s=0 edge case

jangorecki commented 8 months ago

Whenever next element is always smaller, then it will be same bad as N:1

ben-schwen commented 8 months ago

Whenever next element is always smaller, then it will be same bad as N:1

So from an algorithmic point of view, it will be slow when we have a lot of inversions?

jangorecki commented 8 months ago

It depends where the inversions are relatively to windows size and max location within it. Decreasing sequence is an extreme case.

Number or nested finding max calls is reported with verbose, at least in rollmedian branch.

davidcsterratt commented 8 months ago

Glad that the report was helpful. I 'm not expert in these algorithms, but my main observation is that the rolling max/min is surprisingly tricky compared to the rolling mean case! (Before doing some reading online, I started trying to write my own, and got stuck.)

In terms of other test cases, it might be interesting to have some pink noise, i.e. like the rnorm case, but smoothed by a filter. It might also be interesting to check the scaling with the sequence length and the window length.

Anyway, all the best with finding a good solution.

jangorecki commented 8 months ago

Probably algorithm used in median could be adapted for min/max as well, but I don't think itnis worth the effort