lionel- closed 1 year ago
How does the performance look?
@hadley This is only a POC with the simplest of the checkers, so I wasn't expecting a large performance gain. In a simple check the gain is negligible:
bench::mark(
  new = check_bool(TRUE),
  old = check_bool_old(TRUE),
  iterations = 10000
)
#> # A tibble: 2 × 13
#> expre…¹ min median itr/s…² mem_a…³ gc/se…⁴ n_itr n_gc total…⁵
#> <bch:e> <bch:tm> <bch:> <dbl> <bch:b> <dbl> <int> <dbl> <bch:t>
#> 1 new 861.01ns 1.31µs 646621. 0B 0 10000 0 15.5ms
#> 2 old 1.11µs 1.39µs 700259. 0B 70.0 9999 1 14.3ms
However, with a slightly more complex test (allowing NA) we see a benefit:
bench::mark(
  new = check_bool(NA, allow_na = TRUE),
  old = check_bool_old(NA, allow_na = TRUE),
  iterations = 10000
)
#> # A tibble: 2 × 13
#> expre…¹ min median itr/s…² mem_a…³ gc/se…⁴ n_itr n_gc total…⁵
#> <bch:e> <bch:tm> <bch:> <dbl> <bch:b> <dbl> <int> <dbl> <bch:t>
#> 1 new 901.99ns 1.44µs 577491. 0B 0 10000 0 17.3ms
#> 2 old 1.97µs 2.42µs 405933. 0B 40.6 9999 1 24.6ms
The gains will scale with the complexity of the checks. So it will be worth it for check_number_whole() and check_number_decimal(), which are the ones that cause performance issues in dplyr.
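To illustrate why the gain grows with the complexity of the check, here is a minimal sketch of what a pure-R checker might look like. `check_bool_r()` is a hypothetical stand-in, not the actual `check_bool_old()` from this PR. Each extra option adds R-level calls and branches that the interpreter evaluates one by one, whereas a C implementation could presumably fold them into a single native call.

```r
# Hypothetical pure-R checker, written only for illustration.
# It is NOT the implementation benchmarked in this PR.
check_bool_r <- function(x, allow_na = FALSE, allow_null = FALSE) {
  # Every predicate below is a separate interpreted call.
  if (is.logical(x) && length(x) == 1L) {
    if (!is.na(x)) {
      return(invisible(NULL))
    }
    # Extra branch that is only reached for the NA case
    if (allow_na) {
      return(invisible(NULL))
    }
  }
  # Another optional branch, again evaluated at the R level
  if (allow_null && is.null(x)) {
    return(invisible(NULL))
  }
  stop("`x` must be `TRUE` or `FALSE`.", call. = FALSE)
}

check_bool_r(TRUE)                 # passes silently
check_bool_r(NA, allow_na = TRUE)  # passes, but takes one more branch
```

The number checkers would need more of these steps (type, length, wholeness, bounds), which is where moving the work to C is expected to matter most.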
POC for implementing a function in a standalone file in C.
The FFI symbol is exported and we'll commit to supporting it for a full life cycle. If deprecated, it will go through a normal deprecation process ~using the lifecycle package~ (struck that because it might not be available).
If a new version is needed for an interface change, we'll export a new versioned FFI symbol.
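A rough sketch of how versioned symbols could let old and new copies of the standalone file coexist. Everything here is an assumption made for illustration: the names `ffi_is_bool_v1` and `ffi_is_bool_v2` are hypothetical, the `allow_null` interface change is invented, and plain R functions stand in for the exported C symbols so that the example runs on its own.

```r
# R functions standing in for exported C symbols (hypothetical names).
ffi_is_bool_v1 <- function(x, allow_na) {
  is.logical(x) && length(x) == 1L && (allow_na || !is.na(x))
}

# Hypothetical interface change: a new `allow_null` argument.
# Instead of changing v1 in place, a new versioned symbol is exported.
ffi_is_bool_v2 <- function(x, allow_na, allow_null) {
  if (is.null(x)) {
    return(allow_null)
  }
  ffi_is_bool_v1(x, allow_na)
}

# An older copy of the standalone file, vendored in some package,
# keeps calling the v1 symbol and is unaffected by the change.
check_bool_v1_copy <- function(x, allow_na = FALSE) {
  if (ffi_is_bool_v1(x, allow_na)) {
    return(invisible(NULL))
  }
  stop("`x` must be `TRUE` or `FALSE`.", call. = FALSE)
}

# A newer copy of the standalone file targets the v2 symbol.
check_bool_v2_copy <- function(x, allow_na = FALSE, allow_null = FALSE) {
  if (ffi_is_bool_v2(x, allow_na, allow_null)) {
    return(invisible(NULL))
  }
  stop("`x` must be `TRUE` or `FALSE`.", call. = FALSE)
}

check_bool_v1_copy(TRUE)
check_bool_v2_copy(NULL, allow_null = TRUE)
```

The point of the design is that packages holding an older copy of the standalone file never see a changed signature behind a symbol they already call.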