Open HenrikBengtsson opened 4 years ago
I think this behaviour is fairly well known, and this is why descriptors are used instead in parallel settings.
A cleaner way would be to use active bindings of reference classes as proposed in https://github.com/phaverty/bigmemoryExtras by @phaverty, which I've borrowed in bigstatsr.
I realized my request for a bug fix was not clear. I'm not asking to make it possible to export bigmemory objects. The ask is to detect the problem and give an informative error message rather than core dumping. I've updated my top comment accordingly.
Apparently, {magick} has a way to prevent this:
library(magick)
tiger <- image_read_svg('http://jeroen.github.io/images/tiger.svg', width = 400)
saveRDS(tiger, tmp <- tempfile(fileext = ".rds"))
readRDS(tmp)
Error: Image pointer is dead. You cannot save or cache image objects between R sessions.
But I'm not sure how they manage to do this.
Maybe @jeroen can give a clue.
You just need to check that your XPtr is not NULL before using it, everywhere you use:
Rcpp::XPtr<BigMatrix> pMat(bigMatAddr);
I've just tried checking for bigMatAddr == NULL, but it does not seem to do anything?
That gives you the address of the SEXP object itself. To get the external pointer, I think you need pMat.get(); just *pMat may work as well.
You can also use BigMatrix* mat = pMat.checked_get(), which is designed exactly for this purpose: it will raise an error if the pointer is NULL.
That is very useful information, thanks!
I've tried using BigMatrix* pMat = (as< XPtr<BigMatrix> >(bigMatAddr)).checked_get(); instead of Rcpp::XPtr<BigMatrix> pMat(bigMatAddr). It does not seem to prevent the crash.
From what I see in {magick}, you call assert_image() in R, not in C++.
I still believe that the active binding is the way to go to solve this problem, as it avoids having to add a check to every single R function.
As {bigmemory} is not using an RC object (but S4), I'm not sure it can use active bindings. I wonder if we could wrap the externalptr class within another one that does the protection.
All that magick's assert_image function does is call magick_image_dead, which checks whether the pointer is NULL.
Hi, I discovered that bigmemory core dumps (= crashes/terminates) parallel workers when they attempt to use 'big.matrix' objects. This appears to be because of an assumption that the object is always used in the same R process that created it, which breaks down because of the external pointer. Here is a minimal reproducible example:
In one R session, do:
In another R session, do:
Suggestion
Instead of core dumping, detect the problem and give an informative error message:
I don't know the internals, but I assume the problem is that the external pointer is used without making sure it is still valid.
PS. I consider this a quite serious bug, since it can core dump R and parallel workers in R, and it is hard to protect against: people who run parallel code might not even know that bigmemory is used as part of some other package they rely on. This is the first package I know of that uses external pointers and also core dumps R, cf. https://cran.r-project.org/web/packages/future/vignettes/future-4-non-exportable-objects.html. It looks like those other packages can detect the problem and prevent core dumping, so, hopefully, it is not too hard to protect against this.
EDIT 2020-08-15 @ 18:11 UTC: Clarified that the bug fix should be to give an informative error message instead of core dumping.