yutannihilation / savvy

A simple R extension interface using Rust
https://yutannihilation.github.io/savvy/guide/
MIT License

Performance differences observed in r-polars (?) #293

Open eitsupi opened 2 months ago

eitsupi commented 2 months ago

I am not sure if this stems from the difference between extendr and savvy, so apologies if this is completely unrelated.

When comparing the existing polars binding built with extendr (polars) to the rewritten binding built with savvy (neopolars), I noticed that the latter was more than an order of magnitude slower for both constructing a Series from an R vector and exporting one back to an R vector.

https://github.com/pola-rs/r-polars/issues/1079#issuecomment-2331577275

# Construct an Arrow array from an R vector
long_vec_1 <- 1:10^6

bench::mark(
  arrow = {
    arrow::as_arrow_array(long_vec_1)
  },
  nanoarrow = {
    nanoarrow::as_nanoarrow_array(long_vec_1)
  },
  polars = {
    polars::as_polars_series(long_vec_1)
  },
  neopolars = {
    neopolars::as_polars_series(long_vec_1)
  },
  check = FALSE,
  min_iterations = 5
)
#> # A tibble: 4 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 arrow        2.62ms   2.92ms     328.    19.82MB     2.04
#> 2 nanoarrow  496.13µs 644.87µs    1252.   458.41KB     2.03
#> 3 polars       2.06ms   2.26ms     405.     6.33MB     0
#> 4 neopolars    84.6ms   90.1ms      10.9    1.59MB     0

# Export Arrow data as an R vector
arrow_array_1 <- arrow::as_arrow_array(long_vec_1)
nanoarrow_array_1 <- nanoarrow::as_nanoarrow_array(long_vec_1)
polars_series_1 <- polars::as_polars_series(long_vec_1)
neopolars_series_1 <- neopolars::as_polars_series(long_vec_1)

bench::mark(
  arrow = {
    as.vector(arrow_array_1)
  },
  nanoarrow = {
    as.vector(nanoarrow_array_1)
  },
  polars = {
    as.vector(polars_series_1)
  },
  neopolars = {
    as.vector(neopolars_series_1)
  },
  check = TRUE,
  min_iterations = 5
)
#> # A tibble: 4 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 arrow       13.94µs  15.84µs  46309.      4.59KB     4.63
#> 2 nanoarrow   559.9µs   1.85ms    513.      3.85MB    72.8
#> 3 polars       6.45ms   8.79ms    112.      5.93MB     9.13
#> 4 neopolars  148.82ms 164.65ms      6.02    5.24MB     0

Created on 2024-09-05 with reprex v2.1.1

I would appreciate any advice on how to improve the performance.

yutannihilation commented 2 months ago

Indeed neopolars is slower, but it doesn't seem that slow on my Windows machine. Both polars and neopolars were freshly installed from GitHub via pak::pkg_install().
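
For reference, the installation was roughly like the sketch below; pola-rs/r-polars is the repository linked above, while the neopolars path is only a placeholder since its location isn't given in this thread:

# Fresh installs from GitHub; "<owner>/neopolars" is a placeholder path
pak::pkg_install("pola-rs/r-polars")
pak::pkg_install("<owner>/neopolars")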

long_vec_1 <- 1:10^6

bench::mark(
  polars = {
    polars::as_polars_series(long_vec_1)
  },
  neopolars = {
    neopolars::as_polars_series(long_vec_1)
  },
  check = FALSE,
  min_iterations = 5
)
#> # A tibble: 2 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 polars      196.1µs   1.11ms      833.   10.11MB     2.01
#> 2 neopolars    3.13ms   6.43ms      149.    1.03MB     0

polars_series_1 <- polars::as_polars_series(long_vec_1)
neopolars_series_1 <- neopolars::as_polars_series(long_vec_1)

bench::mark(
  polars = {
    as.vector(polars_series_1)
  },
  neopolars = {
    as.vector(neopolars_series_1)
  },
  check = TRUE,
  min_iterations = 5
)
#> # A tibble: 2 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 polars       4.08ms   4.62ms      186.    5.85MB     22.0
#> 2 neopolars    7.18ms   8.53ms      117.    4.56MB     22.5

Created on 2024-09-06 with reprex v2.1.1

eitsupi commented 2 months ago

Thanks for taking a look at this! Perhaps the gap in my benchmark results was widened by different build-time optimizations during my installation process...

But even your results seem to show a difference of about 5x in construction and 2x in export, so I am wondering where the difference comes from.

yutannihilation commented 2 months ago

This repository is for checking whether savvy is sufficiently fast, not for competing with extendr. I think a few milliseconds is fast enough. Let's worry about performance when we hit a problem in more realistic usage.

yutannihilation commented 2 months ago

One possible factor that might affect such a benchmark is that savvy always expands ALTREP vectors.

https://yutannihilation.github.io/savvy/guide/key_ideas.html#treating-external-sexp-and-owned-sexp-differently

In the code above, the ALTREP vector is created only once, so this shouldn't affect the results. But a future benchmark might reveal a bottleneck related to this.
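
As an illustration of what that expansion costs, the sketch below shows that 1:10^6 starts out as a compact ALTREP sequence and only takes its full size once it is materialized (lobstr is used here purely for measuring the sizes; it is not part of the benchmark):

x <- 1:10^6
lobstr::obj_size(x)  # a few hundred bytes: the compact ALTREP representation
y <- x + 0L          # arithmetic forces the sequence to be materialized
lobstr::obj_size(y)  # roughly 4 MB: a fully allocated integer vector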

daniellga commented 2 months ago

Wouldn't it be desirable for both projects to have a comparison benchmark, so we would all know whether an update causes a performance regression? I remember running a simple one when switching to savvy, and IIRC savvy was only a bit slower; nothing to worry about IMO...
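
A rough sketch of what such a shared regression check might look like (the 3x threshold and the overall shape are just an illustration, not an agreed-on design):

res <- bench::mark(
  polars    = polars::as_polars_series(1:10^6),
  neopolars = neopolars::as_polars_series(1:10^6),
  check = FALSE,
  min_iterations = 20
)
# Rows follow the order of the expressions above; as.numeric() gives seconds
ratio <- as.numeric(res$median[2]) / as.numeric(res$median[1])
# Fail (e.g. in CI) if the savvy-based binding regresses past the agreed factor
stopifnot(ratio < 3)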