r-lib / cpp11

cpp11 helps you to interact with R objects using C++ code.
https://cpp11.r-lib.org/
Other
193 stars 46 forks source link

`unwind_protect()` global option is falsely assumed to be immutable #325

Closed DavisVaughan closed 11 months ago

DavisVaughan commented 11 months ago

Fixing https://github.com/r-lib/cpp11/pull/299, which ensures that we are always referencing the R global option cpp11_should_unwind_protect to determine if we need to unwind-protect or not, revealed a big issue with how this is currently set up.

CRAN emailed us about this, and actually reverted cpp11 0.4.4 back to 0.4.3 code (but bumping the version to 0.4.5) because this was such a problem.

An easy way to demonstrate the issue is to install cpp11 0.4.4 and run devtools::check() on the hdpGLM package (there are many others that fail with 0.4.4, including FIESTA MIMSunit amt arealDB campsis ggPMX hdpGLM mpathsenser tidyfit). You should get recursive gc issues and R will eventually crash.


The core problem is that cpp11 expects that the cpp11_should_unwind_protect global option never changes. This is because we take a static pointer to its underlying logical(1) data, which we modify directly from C++ as needed.

This seems reasonable, but many package authors do this:

opt <- options()
on.exit(options(opt))

which does create a copy of every individual option value, including cpp11_should_unwind_protect, invalidating our pointer.

In particular it is done here for hdpGLM, which then calls tidyr::gather(), which uses C++ code written with cpp11, and that is always roughly where the crash occurs: https://github.com/DiogoFerrari/hdpGLM/blob/4ba7b7c88a9a2bea344f13ae373f84b48a1d5a62/R/src_summaries.R#L1546-L1547


A possible fix for this specific issue that Kevin suggested is to instead use an environment in the global option, because those won't be copied, and then put our cpp11 specific values in that environment.

One thing we also discussed is whether changing this will affect other already built packages that linked against "old" cpp11. Those packages will be looking for and modifying cpp11_should_unwind_protect, while any packages built with "new" cpp11 will be looking for cpp11_environment (or whatever we call it). This isn't ideal, but might be okay. At the absolute worst case the package would need to be reinstalled with "new" cpp11.

We could also consider versioning the slots inside the environment based on the current cpp11 version.

So it may look something like:

cpp11_environment <- new.env()

cpp11_environment_0_4_6 <- new.env()
cpp11_environment_0_4_6[["should_unwind_protect"]] <- TRUE

cpp11_environment[["0.4.6"]] <- cpp11_environment_0_4_6

options(cpp11_environment = cpp11_environment)

It is also possible that we don't want to do that, however. The following cpp11 code can crash if should_unwind_protect isn't set up correctly:

[[cpp11::register]]
void test() {
  cpp11::unwind_protect([&] {
    cpp11::unwind_protect([&] {
      Rf_error("oh no!");
    });
  });
}

If test() is called from R:

It is technically possible to be in a scenario where the outer unwind_protect() comes from a package built with one version of cpp11 and the inner unwind_protect() comes from a package built with another version of cpp11. If both calls think they should unwind protect (because they are looking at different versions of should_unwind_protect), then we crash.

DavisVaughan commented 11 months ago

I believe this is actually the cause of the insanely odd bug https://github.com/r-lib/cpp11/issues/244

So the "fix" here https://github.com/r-lib/cpp11/commit/9a62c3a420b076d6ad48c291039421ff739dcc58 only made the problem disappear because we went from using a reference to the global option to using a copy of it through:

static auto should_unwind_protect = detail::get_should_unwind_protect();

since the way to use a reference would have been

static auto& should_unwind_protect = detail::get_should_unwind_protect();

and changing to that will indeed cause this kind of crash again