First, here is a benchmark to show that the nested unwind_protect() optimization isn't working, and the performance loss is pretty brutal. The unwind_protect() calls happen on every iteration of the loop, because assigning an r_string to a writable::strings proxy element is wrapped in unwind_protect(), even though I wrapped the whole loop in unwind_protect() too.
https://github.com/r-lib/cpp11/blob/2aba46eb40ef6d13f1df66f1626afe82dbdf892a/inst/include/cpp11/strings.hpp#L58
set.seed(123)
x <- sample(letters, 1e6, replace = TRUE)
cpp11::cpp_function("
cpp11::writable::strings test(cpp11::strings x) {
R_xlen_t size = x.size();
cpp11::writable::strings out(size);
cpp11::unwind_protect([&] {
for (R_xlen_t i = 0; i < size; ++i) {
out[i] = x[i];
}
});
return out;
}
")
cpp11::cpp_function("
SEXP test2(SEXP x) {
R_xlen_t size = Rf_xlength(x);
SEXP out = PROTECT(Rf_allocVector(STRSXP, size));
for (R_xlen_t i = 0; i < size; ++i) {
SET_STRING_ELT(out, i, STRING_ELT(x, i));
}
UNPROTECT(1);
return out;
}
")
bench::mark(test(x), test2(x), iterations = 20)
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 test(x) 715.35ms 751.55ms 1.31 7.63MB 1.96
#> 2 test2(x) 7.65ms 9.11ms 100. 7.63MB 35.1
Next, here is a video that shows how inner calls to unwind_protect() aren't exiting early. It uses VS Code to step through the example from above. The sequence of steps are:
We step through the outer unwind_protect() call to see how it sets up the static should_unwind_protect local object
Then we step through the first iteration of the out[i] = x[i] copy. That puts us back in unwind_protect(), but that static local object is not yet initialized (weird right?), so we end up going back through the process of setting it up. That means unwind_protect() doesn't early exit here when it should.
Then on the second iteration of out[i] = x[i], the static local should_unwind_protectis set up, but is set to TRUE so we again can't early exit. This continues on for the rest of the loop.
I'm fairly certain the main problem here is that SEXP unwind_protect(Fun&& code) is a template function, meaning that each new instantiation of it gets _its own copy of should_unwind_protect_, and it has to be set up every time.
I've fixed this in a branch I'll send in, and I'll explain the fix there.
I believe that the
unwind_protect()
optimization originally added in https://github.com/r-lib/cpp11/pull/207 isn't quite right. I think it was broken after that PR was merged in https://github.com/r-lib/cpp11/commit/9a62c3a420b076d6ad48c291039421ff739dcc58 with the move to a local static object. I can prove that nested calls tounwind_protect()
aren't being skipped as they should be.First, here is a benchmark to show that the nested
unwind_protect()
optimization isn't working, and the performance loss is pretty brutal. Theunwind_protect()
calls happen on every iteration of the loop, because assigning anr_string
to awritable::strings
proxy element is wrapped inunwind_protect()
, even though I wrapped the whole loop inunwind_protect()
too. https://github.com/r-lib/cpp11/blob/2aba46eb40ef6d13f1df66f1626afe82dbdf892a/inst/include/cpp11/strings.hpp#L58Next, here is a video that shows how inner calls to
unwind_protect()
aren't exiting early. It uses VS Code to step through the example from above. The sequence of steps are:unwind_protect()
call to see how it sets up the staticshould_unwind_protect
local objectout[i] = x[i]
copy. That puts us back inunwind_protect()
, but that static local object is not yet initialized (weird right?), so we end up going back through the process of setting it up. That meansunwind_protect()
doesn't early exit here when it should.out[i] = x[i]
, the static localshould_unwind_protect
is set up, but is set toTRUE
so we again can't early exit. This continues on for the rest of the loop.https://vimeo.com/769942052
I'm fairly certain the main problem here is that
SEXP unwind_protect(Fun&& code)
is a template function, meaning that each new instantiation of it gets _its own copy ofshould_unwind_protect
_, and it has to be set up every time.I've fixed this in a branch I'll send in, and I'll explain the fix there.