HenrikBengtsson / R.matlab

R package: R.matlab
https://cran.r-project.org/package=R.matlab

MATLAB server: Use temporary filenames that are less likely to clash #25

Closed HenrikBengtsson closed 8 years ago

HenrikBengtsson commented 9 years ago

Background

I received a report (private email thread on July 28-29, 2015) on the possibility of name clashes of temporary files when multiple MATLAB servers are running. For example, assume we start and connect to two servers on two different ports as:

> library('R.matlab')

# Start two separate MATLAB servers
> Matlab$startServer(port=9997)
> Matlab$startServer(port=9999)

# Connect to each of them
> matlab1 <- Matlab(port=9997); open(matlab1)
> matlab2 <- Matlab(port=9999); open(matlab2)

We can then evaluate different MATLAB expressions on each of them as:

# Evaluate expression in each of them
> evaluate(matlab1, "x=1+2; x")
> evaluate(matlab2, "y=1+2; y")

Potential problem

Next, imagine we scale this up and run tens or hundreds of parallel MATLAB jobs this way. Then there is a risk that the temporary files used to send and receive commands/data between R and MATLAB may overwrite/delete each other because the different processes happen to share the same temporary file names.

It could also be that the temporary file names generated by R / R.matlab clash when running many jobs in parallel.

When one MATLAB job deletes a temporary file that happens to have the same name as the one used by another MATLAB job, the symptom may look like:
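One way to check this second hypothesis is to generate temporary file names in several forked workers and look for duplicates. The following is a minimal sketch only, assuming a Unix-alike system where parallel::mclapply() can fork; the helper name collect_tempnames() is made up for illustration and is not part of R.matlab.

```r
library(parallel)

# Hypothetical helper (not part of R.matlab): ask for 'n' temporary
# file names, spread over forked workers, the way a client would
# before writing a MAT file.
collect_tempnames <- function(n, cores = 2L) {
  # Forking is not available on Windows, so fall back to one core there.
  if (.Platform$OS.type == "windows") cores <- 1L
  unlist(mclapply(seq_len(n), function(i) {
    tempfile(fileext = ".mat")
  }, mc.cores = cores))
}

tmpnames <- collect_tempnames(20L)

# Any repeated name would mean two workers could write to the same file.
dups <- tmpnames[duplicated(tmpnames)]
cat(sprintf("%d names generated, %d duplicated\n",
            length(tmpnames), length(dups)))
```

Finding no duplicates in a run like this would not rule out a clash under heavier load; it only shows whether the names collide easily on a given setup.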


  Unable to read file '/tmp/RtmpGYefwk/file489c7040c5a6.mat': no such file or
  directory.

  Error in MatlabServer (line 322)
      load(filename);

FYI, MatlabServer (line 322) is part of:

  %-------------------
  % 'receive'
  %-------------------
  elseif (state == strmatch('receive', commands, 'exact'))
    filename = char(readUTF(is));
    fprintf(1, 'Will read MAT file: "%s"\n', filename);
    load(filename);
    clear filename;
    writeByte(os, 0);
    state = 0;
  end
end

Note that the filename is actually created by R.matlab and sent over to MATLAB.

Summary

It's not 100% clear to me why/where the risk of name clashes occurs, or if it is due to something else, e.g. R deleting the temporary file before MATLAB had a chance to read it. I leave this issue here for the record, in case someone else experiences similar problems.

stefanavey commented 9 years ago

After originally reporting this problem, I was able to solve it by creating a Matlab client object with the remote option set to TRUE, which avoids communicating via the filesystem (see help(Matlab)):

matlab <- Matlab(port=port, remote=TRUE)

I think the problem was that the tempname function in Matlab (used to create temporary files in MatlabServer.m) does not guarantee a unique name in some cases. I was using mclapply in R and suspect that multiple forked processes got the same name from tempname and one process deleted the file before the other could read it. The error is actually a great thing because otherwise you could imagine a scenario where Matlab writes the file in one process and reads the same file in a different process which would likely lead to incorrect results.

HenrikBengtsson commented 9 years ago

The error is actually a great thing because otherwise you could imagine a scenario where Matlab writes the file in one process and reads the same file in a different process which would likely lead to incorrect results.

Unfortunately, there's still the risk that two MATLAB jobs read from the same file before the file is deleted. I admit that the chance of this should be much smaller than the risk of the two working with the same temporary file.

If tempname is not unique enough as you say, then one could add the port number to also be part of the temporary file name. For instance, instead of

    tmpname = sprintf('%s.mat', tempname);

one could use

    tmpname = sprintf('%s_%d.mat', tempname, port);

I'll probably add this soon - it seems harmless to do so.
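The same idea could be applied to file names created on the R side. This is a sketch only, not what R.matlab does: unique_matfile() is a hypothetical helper that embeds the port number and the R process ID in the file name, so that two clients sharing a temporary directory should not end up with the same name unless they also share both values.

```r
# Hypothetical helper (not part of R.matlab): build a temporary MAT
# file name that embeds the server port and the current R process ID.
unique_matfile <- function(port, tmpdir = tempdir()) {
  stopifnot(is.numeric(port), length(port) == 1L)
  prefix <- sprintf("file_%d_%d_", as.integer(port), Sys.getpid())
  # tempfile() appends a random suffix to the prefix.
  tempfile(pattern = prefix, tmpdir = tmpdir, fileext = ".mat")
}

f1 <- unique_matfile(9997)
f2 <- unique_matfile(9999)
print(basename(c(f1, f2)))
```

Including the process ID is an extra assumption beyond the port-only proposal above; it is meant to cover forked R workers, which share one session temporary directory.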

stefanavey commented 9 years ago

Yes, I agree there is still that risk, which might not produce an error. I think your proposed enhancement should prevent any clashes when multiple MATLAB servers are running simultaneously on different ports. When it is added I can test it. Thanks!

HenrikBengtsson commented 9 years ago

I've added this to branch hotfix/tempfiles (R.matlab 3.2.0-1), which is a minimal fork from R.matlab 3.2.0 on CRAN (the same as what is in the master branch). You can install the fix as:

source("http://callr.org/install#HenrikBengtsson/R.matlab@hotfix/tempfiles")

To go back to 3.2.0, just do:

install.packages("R.matlab")

I'm keen to hear if you observe a difference and if this solves the problem.