Closed lionel- closed 1 year ago
Oops, the benchmarks above were run with -g -O0; the -O3 benchmarks are better:
bench::mark(
new = check_number_whole(1),
old = check_number_whole_old(1),
iterations = 100000
)
#> # A tibble: 2 × 13
#> expression min median `itr/sec` mem_al…¹ gc/se…² n_itr n_gc
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:by> <dbl> <int> <dbl>
#> 1 new 573.93ns 737.84ns 1238480. 0B 61.9 99995 5
#> 2 old 2.67µs 3.03µs 321286. 0B 51.4 99984 16
bench::mark(
new = check_number_decimal(1.5),
old = check_number_decimal_old(1.5),
iterations = 100000
)
#> # A tibble: 2 × 13
#> expression min median `itr/sec` mem_al…¹ gc/se…² n_itr n_gc
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:by> <dbl> <int> <dbl>
#> 1 new 614.91ns 738.07ns 1221700. 0B 61.1 99995 5
#> 2 old 2.34µs 2.71µs 359769. 0B 57.6 99984 16
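To put the -O3 numbers in perspective, the median timings can be turned into speedup ratios. This is only a rough sketch in base R: the vector names are illustrative, and the values are copied by hand from the two outputs above rather than taken from the bench objects.

```r
# Median timings in nanoseconds, copied from the two bench::mark() outputs above.
old_ns <- c(check_number_whole = 3030, check_number_decimal = 2710)
new_ns <- c(check_number_whole = 737.84, check_number_decimal = 738.07)

# Ratio of old to new median time; values above 1 mean the new version is faster.
speedup <- old_ns / new_ns
print(round(speedup, 1))
```

On these medians the C versions come out roughly 4x faster per call, though single-call timings under a microsecond are noisy and will vary by machine.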
I think it will help in lead() and lag() for sure. If we didn't make this change, then after I add in faster vctrs assertions this would probably show up as one of the larger sources of slowness.
library(dplyr)
x <- 1:5 + 0L
bench::mark(lead(x), iterations = 100000)
# Main (dev dplyr, dev rlang, dev vctrs)
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 lead(x) 32.1µs 43.2µs 22647. 211KB 15.6
# With this PR and updated standalone file
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 lead(x) 21.6µs 28.7µs 34649. 131KB 15.3
# simulate many groups
bench::mark(for(i in 1:10000) lead(x), iterations = 10)
# Main (dev dplyr, dev rlang, dev vctrs)
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 for (i in 1:10000) lead(x) 350ms 358ms 2.73 2.56MB 20.8
# With this PR and updated standalone file
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 for (i in 1:10000) lead(x) 226ms 264ms 3.82 2.56MB 19.5
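For a rough sense of what this means for lead(), the medians above can be compared directly. A minimal sketch in base R, with hand-copied values and illustrative names; it says nothing about variance between runs.

```r
# Median timings copied from the lead(x) benchmarks above.
# Single call is in microseconds, the 10000-call loop in milliseconds;
# units cancel in the ratios, so they are kept as reported.
main   <- c(single_call = 43.2, loop_10000 = 358)
908 -> .unused_placeholder  # (no-op assignment; not part of the calculation)
pr     <- c(single_call = 28.7, loop_10000 = 264)

# Fraction of median time saved by the PR relative to main.
time_saved <- 1 - pr / main
print(round(100 * time_saved))  # percentages
```

That works out to about a third less time for a single call and about a quarter less for the simulated many-groups loop, so the checks are a noticeable but not dominant part of lead()'s cost here.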
@lionel- I think your M1 runs R code much faster than my Intel, so the switch to C seems more noticeable for me:
x <- 1L
bench::mark(check_number_whole(x), iterations = 500000)
# Main
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 check_number_whole(x) 7.01µs 9.17µs 106014. 2.08KB 17.4
# This PR
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 check_number_whole(x) 1.99µs 2.34µs 402779. 0B 12.1
@DavisVaughan GET A NEW LAPTOP
I was hoping for more of a speed-up, especially given how much effort it is to write these things in C. I guess the R versions are already quite fast to begin with. Will this be enough to fix the perf issues @DavisVaughan?
FYI @hadley