r-lib / rlang

Low-level API for programming with R
https://rlang.r-lib.org

Implement `check_bool()` in C #1559

lionel- closed 1 year ago

lionel- commented 1 year ago

POC for implementing a function in a standalone file in C.

hadley commented 1 year ago

How does the performance look?

lionel- commented 1 year ago

@hadley This is only a POC with the simplest of the checkers, so I wasn't expecting a large performance gain. In a simple check the gain is negligible:

bench::mark(
  new = check_bool(TRUE),
  old = check_bool_old(TRUE),
  iterations = 10000
)
#> # A tibble: 2 × 13
#>   expre…¹      min median itr/s…² mem_a…³ gc/se…⁴ n_itr  n_gc total…⁵
#>   <bch:e> <bch:tm> <bch:>   <dbl> <bch:b>   <dbl> <int> <dbl> <bch:t>
#> 1 new     861.01ns 1.31µs 646621.      0B     0   10000     0  15.5ms
#> 2 old       1.11µs 1.39µs 700259.      0B    70.0  9999     1  14.3ms

However, with a slightly more complex test (allowing NA), we see a benefit (median 1.44µs versus 2.42µs):

bench::mark(
  new = check_bool(NA, allow_na = TRUE),
  old = check_bool_old(NA, allow_na = TRUE),
  iterations = 10000
)
#> # A tibble: 2 × 13
#>   expre…¹      min median itr/s…² mem_a…³ gc/se…⁴ n_itr  n_gc total…⁵
#>   <bch:e> <bch:tm> <bch:>   <dbl> <bch:b>   <dbl> <int> <dbl> <bch:t>
#> 1 new     901.99ns 1.44µs 577491.      0B     0   10000     0  17.3ms
#> 2 old       1.97µs 2.42µs 405933.      0B    40.6  9999     1  24.6ms

The gains will scale with the complexity of the checks, so it will be worth it for `check_number_whole()` and `check_number_decimal()`, which are the ones that cause performance issues in dplyr.
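
For readers who want to see what "complexity of the checks" can mean in practice, here is a minimal sketch of a checker written in plain R. The function `check_bool_sketch()` and its error message are hypothetical and written only for illustration; this is not rlang's actual `check_bool()` code. It only shows that each option such as `allow_na` adds more work done at the R level, which is plausibly the kind of cost a C implementation reduces.

```r
# Hypothetical pure-R checker, for illustration only (not rlang's code).
check_bool_sketch <- function(x, allow_na = FALSE, allow_null = FALSE) {
  if (is.null(x)) {
    # NULL is accepted only when explicitly allowed.
    if (allow_null) return(invisible(NULL))
  } else if (is.logical(x) && length(x) == 1L) {
    # A single TRUE/FALSE passes; a single NA passes only with allow_na.
    if (!is.na(x) || allow_na) return(invisible(NULL))
  }
  stop("`x` must be `TRUE` or `FALSE`.", call. = FALSE)
}

check_bool_sketch(TRUE)                 # passes silently
check_bool_sketch(NA, allow_na = TRUE)  # passes silently
# Capture the error message instead of aborting:
tryCatch(check_bool_sketch(NA), error = function(e) conditionMessage(e))
```

A sketch like this could be timed against another implementation with `bench::mark()` in the same way as the benchmarks above.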