MattCowgill closed this issue 3 years ago
A friend of mine ran the same code above and got different results, with `slide_mean()` slightly faster than `zoo::rollapply()`. That makes me wonder if this is some weird M1 Mac issue.
Interesting, here is what I get on my 2018 Intel Mac running Mojave
library(dplyr)

size <- 100000

x <- tibble(num = rnorm(size, mean = 10, sd = 2),
            letters = sample(letters, size, replace = TRUE))

f_slider <- function(data) {
  data %>%
    group_by(letters) %>%
    mutate(mean = slider::slide_mean(x = num,
                                     before = 11L,
                                     complete = TRUE))
}

f_zoo <- function(data) {
  data %>%
    group_by(letters) %>%
    mutate(mean = zoo::rollmeanr(num, 12, fill = NA))
}

bench::mark(f_slider(x),
            f_zoo(x))
#> # A tibble: 2 x 6
#>   expression       min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>  <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 f_slider(x)   8.67ms   9.98ms      98.8    7.13MB     12.0
#> 2 f_zoo(x)     28.53ms   28.7ms      33.9   30.06MB    170.
It is possible this has to do with how efficiently your machine handles `long double`s, but I'm not entirely sure.

Could you try some benchmarks with `slide_max()` against `rollmax()`? That doesn't use long doubles.

And then again with `slide_sum()` against `rollsum()`? That uses long doubles, but in a slightly simpler way.
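The two comparisons suggested above might be set up roughly as follows. This is only a sketch, not code from the thread: the helper names (`max_slider`, `max_zoo`, `sum_slider`, `sum_zoo`) are made up for illustration, the input is a plain numeric vector rather than the grouped tibble, and it assumes the right-aligned `zoo::rollmaxr()` and `zoo::rollsumr()` are the intended counterparts, mirroring the `rollmeanr()` call used earlier.

```r
library(slider)
library(zoo)

set.seed(123)
x <- rnorm(100000, mean = 10, sd = 2)

# Hypothetical helpers: 12-element trailing window, NA until the window is full.
max_slider <- function(x) slide_max(x, before = 11L, complete = TRUE)
max_zoo    <- function(x) rollmaxr(x, k = 12L, fill = NA)

sum_slider <- function(x) slide_sum(x, before = 11L, complete = TRUE)
sum_zoo    <- function(x) rollsumr(x, k = 12L, fill = NA)

# check = FALSE: only timings are compared here, not the returned values.
bench::mark(max_slider(x), max_zoo(x), check = FALSE)
bench::mark(sum_slider(x), sum_zoo(x), check = FALSE)
```

If the max comparison looks normal while the sum and mean comparisons are slow, that would point towards the long double handling rather than slider's windowing logic.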
Hi @DavisVaughan, I have tried upgrading to the native arm64 build of R 4.1.0. `slide_mean()` is now extremely fast for me. Thank you - perhaps there is something about the Rosetta emulation on M1 Macs running x86 R that slows slider down in that situation.
library(tidyverse)

size <- 100000

x <- tibble(num = rnorm(size, mean = 10, sd = 2),
            letters = sample(1L:26L, size, replace = TRUE)
)

f_slider <- function(data) {
  data %>%
    group_by(letters) %>%
    mutate(mean = slider::slide_mean(x = num,
                                     before = 11L,
                                     complete = TRUE))
}

f_zoo <- function(data) {
  data %>%
    group_by(letters) %>%
    mutate(mean = zoo::rollmeanr(num, k = 12L, fill = NA))
}

bench::mark(f_slider(x),
            f_zoo(x))
#> # A tibble: 2 x 6
#>   expression       min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>  <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 f_slider(x)    4.8ms   4.98ms     199.     6.34MB     35.3
#> 2 f_zoo(x)     12.32ms  12.99ms      77.4   29.54MB    294.
Thanks again for a great package.
It does seem that the Rosetta 2 emulation uses extended precision 80-bit long doubles (which is what the Intel Mac uses), but native ARM supports only 64-bit long doubles (i.e. they are the same as a typical double).
My Mac also uses 80-bit long doubles since it is Intel, but is pretty fast, so maybe there is something strange going on in the Rosetta 2 emulation as you mentioned.
Search "Rosetta 2" here: https://stardot.org.uk/forums/viewtopic.php?t=22495
> But there's one gotcha, which nobody (except me) ever seems to mention: ARM currently has no hardware support for floating-point arithmetic with a better precision than 64-bits ('double') whereas x86 has 80-bit floats ('long double'). I can't be alone in having applications which need better than 64-bit precision, typically because many calculations get chained and losing half-an-LSB at each step isn't acceptable. One such application, FIRBBC (which synthesises Finite Impulse Response filters), simply doesn't work reliably with 64-bit floats. So unless and until ARM supports something better than 64-bit floats it can't compete with x86 in some critical applications. It's ironic that Acorn's own early designs for a floating-point ARM coprocessor did support 80-bit floats, but that didn't survive integration with the main CPU. Admittedly Apple's Rosetta 2 emulation, which runs x86 code on the M1, does properly support 80-bit long doubles (in itself an impressive feat) and is a partial solution, but speed is obviously impacted quite significantly.
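For anyone wanting to see which situation their own R session is in, base R exposes the relevant details. The snippet below is a small sketch using only base R. The values noted in the comments are what I would typically expect, not output taken from this thread.

```r
# Architecture this R build targets. An Intel build (including one running
# under Rosetta 2) typically reports "x86_64"; a native Apple silicon build
# typically reports "aarch64".
R.version$arch

# Size in bytes of the C `long double` type used by this build of R.
# Commonly 16 on 64-bit x86 builds and 8 where long double is the same as double.
.Machine$sizeof.longdouble

# TRUE only if this build uses a long double type that is longer than double.
capabilities("long.double")
```

An x86_64 build reporting a long double larger than 8 bytes on an M1 machine would be consistent with the Rosetta 2 explanation above, although it does not prove that emulated long double arithmetic is the cause of the slowdown.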
Hi @DavisVaughan,

First: I love {slider}, thank you for making it.

I'm keen to replace various other functions in my code with their {slider} equivalents. One problem I have is that `zoo::rollmeanr()` is faster (for me at least) than `slider::slide_mean()`. Here is an example:

Created on 2021-06-09 by the reprex package (v2.0.0)
Session info
``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.0.5 (2021-03-31)
#>  os       macOS Big Sur 10.16
#>  system   x86_64, darwin17.0
#>  ui       X11
#>  language (EN)
#>  collate  en_AU.UTF-8
#>  ctype    en_AU.UTF-8
#>  tz       Australia/Melbourne
#>  date     2021-06-09
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date       lib source
#>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.0.2)
#>  backports     1.2.1   2020-12-09 [1] CRAN (R 4.0.2)
#>  bench         1.1.1   2020-01-13 [1] CRAN (R 4.0.2)
#>  cli           2.5.0   2021-04-26 [1] CRAN (R 4.0.5)
#>  crayon        1.4.1   2021-02-08 [1] CRAN (R 4.0.2)
#>  DBI           1.1.1   2021-01-15 [1] CRAN (R 4.0.2)
#>  digest        0.6.27  2020-10-24 [1] CRAN (R 4.0.2)
#>  dplyr       * 1.0.6   2021-05-05 [1] CRAN (R 4.0.5)
#>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.0.2)
#>  evaluate      0.14    2019-05-28 [1] CRAN (R 4.0.1)
#>  fansi         0.5.0   2021-05-25 [1] CRAN (R 4.0.2)
#>  fs            1.5.0   2020-07-31 [1] CRAN (R 4.0.2)
#>  generics      0.1.0   2020-10-31 [1] CRAN (R 4.0.2)
#>  glue          1.4.2   2020-08-27 [1] CRAN (R 4.0.2)
#>  highr         0.9     2021-04-16 [1] CRAN (R 4.0.2)
#>  htmltools     0.5.1.1 2021-01-22 [1] CRAN (R 4.0.2)
#>  knitr         1.33    2021-04-24 [1] CRAN (R 4.0.2)
#>  lattice       0.20-41 2020-04-02 [1] CRAN (R 4.0.5)
#>  lifecycle     1.0.0   2021-02-15 [1] CRAN (R 4.0.2)
#>  magrittr      2.0.1   2020-11-17 [1] CRAN (R 4.0.2)
#>  pillar        1.6.1   2021-05-16 [1] CRAN (R 4.0.5)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.0.2)
#>  profmem       0.6.0   2020-12-13 [1] CRAN (R 4.0.2)
#>  purrr         0.3.4   2020-04-17 [1] CRAN (R 4.0.2)
#>  R6            2.5.0   2020-10-28 [1] CRAN (R 4.0.2)
#>  reprex        2.0.0   2021-04-02 [1] CRAN (R 4.0.2)
#>  rlang         0.4.11  2021-04-30 [1] CRAN (R 4.0.2)
#>  rmarkdown     2.8     2021-05-07 [1] CRAN (R 4.0.2)
#>  rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.0.2)
#>  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.0.2)
#>  slider        0.2.1   2021-03-23 [1] CRAN (R 4.0.2)
#>  stringi       1.6.2   2021-05-17 [1] CRAN (R 4.0.2)
#>  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.0.2)
#>  styler        1.4.1   2021-03-30 [1] CRAN (R 4.0.2)
#>  tibble        3.1.2   2021-05-16 [1] CRAN (R 4.0.2)
#>  tidyselect    1.1.1   2021-04-30 [1] CRAN (R 4.0.2)
#>  utf8          1.2.1   2021-03-12 [1] CRAN (R 4.0.2)
#>  vctrs         0.3.8   2021-04-29 [1] CRAN (R 4.0.2)
#>  warp          0.2.0   2020-10-21 [1] CRAN (R 4.0.2)
#>  withr         2.4.2   2021-04-18 [1] CRAN (R 4.0.5)
#>  xfun          0.23    2021-05-15 [1] CRAN (R 4.0.2)
#>  yaml          2.2.1   2020-02-01 [1] CRAN (R 4.0.2)
#>  zoo           1.8-9   2021-03-09 [1] CRAN (R 4.0.2)
#>
#> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library
```

I'm not clear whether the problem is with me (is there something in the example above I should change?) or if `slide_mean()` is just a bit slower than `rollmean`.

Thanks again