r-lib / sparsevctrs

Sparse vector class using ALTREP
https://r-lib.github.io/sparsevctrs/

Add some sparse helpers - mean, sd, var, median #79

Closed EmilHvitfeldt closed 2 weeks ago

EmilHvitfeldt commented 3 weeks ago

Ref: #49

The helpers are currently written in R, but they are faster than the default implementations (such as `mean()`) at common data sizes.

They also exit early for completely sparse vectors.
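To illustrate why a sparse helper can beat the dense default, here is a minimal base-R sketch of a sparse mean with an early exit. It is not the code from this PR: `sketch_sparse_mean()` and its arguments are hypothetical names, and it assumes the vector is described by its stored values, its total length, and a default value of 0.

```r
# Hypothetical helper for illustration only; not the package's implementation.
# `values` are the stored (non-default) entries, `length` is the full vector
# length, and `default` is the value of every entry that is not stored.
sketch_sparse_mean <- function(values, length, default = 0) {
  if (length == 0) {
    return(NaN) # matches mean(numeric(0))
  }

  n_stored <- length(values)

  # Early exit: nothing is stored, so every element equals the default.
  if (n_stored == 0) {
    return(default)
  }

  # Only the stored values are summed; the remaining entries contribute
  # `default` each, so the cost depends on n_stored, not on `length`.
  (sum(values) + default * (length - n_stored)) / length
}

values <- c(10, 50, 11)
positions <- c(1, 50, 111)

# Dense equivalent, built only to check the result.
dense <- numeric(1000)
dense[positions] <- values

sparse_result <- sketch_sparse_mean(values, length = 1000)
isTRUE(all.equal(sparse_result, mean(dense)))
```

The positions are not needed for the mean itself; they are used here only to build the dense vector for comparison.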

library(sparsevctrs)

# Three non-zero values (at positions 1, 50, and 111) in a vector of length 1000
x <- sparse_double(c(10, 50, 11), c(1, 50, 111), 1000)

bench::mark(
  dense = mean(x),
  sparse = sparse_mean(x)
)
#> # A tibble: 2 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 dense       35.47µs   35.8µs    27010.    23.3KB     2.70
#> 2 sparse       1.48µs    1.6µs   574807.    19.2KB    57.5

# Same three values, now in a vector of length 100000
x <- sparse_double(c(10, 50, 11), c(1, 50, 111), 100000)

bench::mark(
  dense = mean(x),
  sparse = sparse_mean(x)
)
#> # A tibble: 2 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 dense        3.42ms   3.46ms      288.        0B        0
#> 2 sparse       1.44µs   1.56µs   579646.        0B        0
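In the two benchmarks above, the sparse median time stays close to 1.6µs while the dense time grows from about 36µs to about 3.5ms. That is consistent with the work depending on the number of stored values rather than on the vector length. The same idea carries over to the variance, sketched below in base R. `sketch_sparse_var()` is a hypothetical name and is not taken from the PR; it assumes the same values/length/default description as a sparse vector.

```r
# Hypothetical helper for illustration only; not the package's implementation.
# Computes the sample variance (denominator n - 1, as in stats::var) from the
# stored values alone.
sketch_sparse_var <- function(values, length, default = 0) {
  if (length < 2) {
    return(NA_real_) # stats::var() also returns NA for fewer than 2 values
  }

  n_stored <- length(values)

  # Early exit: all elements are equal to the default, so there is no spread.
  if (n_stored == 0) {
    return(0)
  }

  n_default <- length - n_stored
  m <- (sum(values) + default * n_default) / length

  # Squared deviations of the stored values, plus one shared term for all
  # the default entries.
  ss <- sum((values - m)^2) + n_default * (default - m)^2
  ss / (length - 1)
}

values <- c(10, 50, 11)
positions <- c(1, 50, 111)

# Dense equivalent, built only to check the result.
dense <- numeric(100000)
dense[positions] <- values

sparse_var_result <- sketch_sparse_var(values, length = 100000)
isTRUE(all.equal(sparse_var_result, var(dense)))
```

A standard deviation follows as `sqrt()` of this result. A median needs different handling, since it depends on how the stored values are ordered relative to the default.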