eddelbuettel / rcppcnpy

Rcpp bindings for NumPy files
GNU General Public License v2.0
26 stars 16 forks source link

RcppCNPy::npySave will make R crash if the folder of the `filename` does not exist #14

Closed wush978 closed 6 years ago

wush978 commented 6 years ago

Hi Dirk,

On my machine, this bug interrupts my R session many times, so could we add a check before writing the file?

Best, Wush

eddelbuettel commented 6 years ago

Sure.

Doesn't seem trivial though as filesystem operations are not portable, see SO question.

eddelbuettel commented 6 years ago

Does not look good. We could rely on Boost for something like

void checkFilename(std::string filename) {
    boost::filesystem::path path = boost::filesystem::path(filename.c_str());
    boost::filesystem::path parentpath = path.parent_path();
    Rcpp::Rcout << "Seeing parent " << parentpath << std::endl;
    return;
}

but this requires linking against -lboost_system -lboost_filesystem so we cannot do it via BH.

So in short: not sure we can.

We could add an R function as a helper so that you would do npySave(checkFilename(filename), object). Good enough?

wush978 commented 6 years ago

How about directly call R function in Rcpp? And add an argument checkdir = TRUE to let the user enable/disable the checking so they could avoid the overhead of calling R function if they want.

To check the existence, we can call base::basename and base::dir.exists.

eddelbuettel commented 6 years ago

Yes, I had the same thought. And the extra argument is a good idea too.

eddelbuettel commented 6 years ago

I put something together on the commute in which I'll send out as PR when I get; please have a look at that. It basically just error(0's out if the directory does not exist avoiding the later crash you may have seen.

eddelbuettel commented 6 years ago

This is now #15 -- give it a spin. In a nutshell:

R> library(RcppCNPy)
R> npySave("/tmp/foo.npy", matrix(1:9, 3, 3))
R> npySave("/tmp/abc/def/ghi/foo.npy", matrix(1:9, 3, 3), checkPath=TRUE)
Error in npySave("/tmp/abc/def/ghi/foo.npy", matrix(1:9, 3, 3), checkPath = TRUE) : 
  Evaluation error: Directory '/tmp/abc/def/ghi' does not exist.
R> 
wush978 commented 6 years ago

It looks good.

By the way, shall we enable checkPath by default? For me, avoiding crashing R is more important than efficiency and the checking should not break the compatibility.

eddelbuettel commented 6 years ago

Maybe, maybe not. Let me think about it for a second. There is a school of thought that leaves some responsibility with the user (ie unlink() does not ask "do you really mean to remove the file") but checking assumptions is fair too.

But "probably". I'll deal with it.

eddelbuettel commented 6 years ago

Check now -- I dropped two more commits onto this.

eddelbuettel commented 6 years ago

Any more comment, or is #15 good to go as it is now?