andersonwinkler / PALM

PALM: Permutation Analysis of Linear Models

Correct issue with MATLAB vs IS_CLUSTER #37

Closed. scratchings closed this 3 years ago

scratchings commented 3 years ago

MATLAB script files should be passed into MATLAB via STDIN, e.g. matlab < myscript.m

The script is modified to use the appropriate syntax depending on the value of IS_CLUSTER.
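A minimal sketch of the branching described above, not the actual PALM launcher code. The function name build_matlab_cmd and the dry-run approach (printing the command instead of running it) are assumptions made for illustration. IS_CLUSTER and RUNCMD are the names used in this thread, and the "-r" branch for the non-cluster case is an assumption based on the later comments.

```shell
#!/bin/sh
# Hypothetical helper: print the command line that would start MATLAB.
#   $1 = IS_CLUSTER (1 or 0)
#   $2 = RUNCMD (the MATLAB statements to execute)
#   $3 = path of the temporary script file
build_matlab_cmd() {
    is_cluster=$1
    runcmd=$2
    tmpfile=$3
    if [ "$is_cluster" = "1" ]; then
        # Envelop the statements in a file and feed that file on stdin.
        printf '%s\n' "$runcmd" > "$tmpfile"
        printf 'matlab < %s\n' "$tmpfile"
    else
        # Pass the statements directly on the command line.
        printf 'matlab -r "%s"\n' "$runcmd"
    fi
}
```

Printing the command keeps the sketch runnable on a machine without MATLAB; a real launcher would execute the command rather than echo it.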

andersonwinkler commented 3 years ago

Thank you Duncan. Do you know if reading from stdin will work if 'matlab' is a script that in turn invokes the real matlab? Cheers, Anderson

gllmflndn commented 3 years ago

I would be mindful that the STDIN option is, I think, undocumented and that, in my experience, there is a maximum script size beyond which the content is truncated. I don't know what the difference between MATLAB and IS_CLUSTER is here though. I tend to use the -batch "statement" option.

scratchings commented 3 years ago

The STDIN option is actually documented here: https://uk.mathworks.com/help/matlab/matlab_env/start-matlab-on-linux-platforms.html

in the section about remote SSH launching of MATLAB.

We use the STDIN option extensively on our cluster (typically with fsl_sub) and have yet to experience any issues with script length, certainly not with scripts the size of the one generated in the temporary file. STDIN itself shouldn’t have any limits (as per POSIX); we use it to transfer TBs of data through pipelines, so any limit would be imposed by MATLAB.

From reading the help text for the ‘matlab’ command, the -batch option can take script files, but is fairly dumb in that it only takes the name of the script, sans extension and path, so theoretically if your temp file was /tmp/TEMPFILENAME.m, you could do:

(cd /tmp; matlab -batch TEMPFILENAME)

However, this won't work, because the addpath('.') call then adds the wrong folder (/tmp rather than the folder PALM was launched from) to MATLAB's search path and it all falls over.
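To make the constraint concrete, here is a small sketch of how a temporary file path would have to be split for -batch. The helper name batch_cmd_for is hypothetical and the command is only printed, not run. The name check reflects the fact that a MATLAB script name must be a valid identifier, which a typical mktemp name containing dots would not be. The sketch does nothing about the addpath('.') problem.

```shell
#!/bin/sh
# Hypothetical helper: given /some/dir/NAME.m, print the -batch invocation.
batch_cmd_for() {
    script=$1
    dir=$(dirname "$script")
    name=$(basename "$script" .m)
    # The name must start with a letter...
    case $name in
        [A-Za-z]*) ;;
        *) echo "not a valid MATLAB script name: $name" >&2; return 1 ;;
    esac
    # ...and contain only letters, digits and underscores.
    case $name in
        *[!A-Za-z0-9_]*) echo "not a valid MATLAB script name: $name" >&2; return 1 ;;
    esac
    printf '(cd %s; matlab -batch %s)\n' "$dir" "$name"
}
```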

I am curious as to why the IS_CLUSTER option exists at all.

andersonwinkler commented 3 years ago

Thank you guys.

The IS_CLUSTER option was introduced because a user once reported that they couldn't run PALM in their environment: RUNCMD could contain characters that their system wouldn't parse. Enveloping the command in a file solved the issue.

From the links you sent it seems that "-r" has been replaced by "-batch" as of R2019a, but it isn't a perfect replacement, as the path issue indicates. The page says that we can use "-sd" to specify the startup directory, but that isn't good either: the user may supply files with paths relative to their current folder, which won't be /tmp, so those input files won't be found. We could store the temporary file in the folder indicated by PALM's "-o" option, but this would require parsing the options too early. The same applies if we attempt to replace relative paths with absolute ones: it requires parsing the options too soon.

I'm not happy with stdin either: I fear that if "matlab" is a script (wasn't it like this in jalapeno back then?) and that script doesn't handle stdin well, it will all crash, whereas if we give explicit options (such as "-r", "-batch", "-sd") we'd be more defensive against such issues. But then, these aren't perfect either.
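For reference, a wrapper of this kind passes standard input through unchanged if it hands control to the real launcher with exec, because the new process inherits the wrapper's file descriptors. A minimal sketch follows; the variable REAL_MATLAB and its default path are hypothetical, and any site-specific setup is omitted.

```shell
#!/bin/sh
# Hypothetical site wrapper, written as a function so it can be tried out.
# REAL_MATLAB is an assumed variable naming the real MATLAB launcher.
# The body runs in a subshell, so exec replaces only that subshell.
matlab_wrapper() (
    real=${REAL_MATLAB:-/opt/matlab/bin/matlab}
    # Site-specific setup would go here. It must not read from stdin,
    # or it would consume part of the script intended for MATLAB.
    exec "$real" "$@"
)
```

With such a wrapper, matlab_wrapper < myscript.m delivers the script to the real launcher. A wrapper that reads from stdin itself before launching MATLAB would break this.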

Interestingly, it seems we can't test for the MATLAB version (so that we could retain "-r" for old versions) without using an m-file that calls "version", so we'd face the same problem just to figure out which MATLAB version we are using.
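One possible workaround, offered only as a sketch: since the launcher's help text mentions -batch on versions that support it, that text could be probed instead of the version number. The function names supports_batch and startup_flag are hypothetical, and the assumption that "matlab -help" prints the usage text and exits without starting MATLAB (including when "matlab" is a site wrapper) would need to be verified locally.

```shell
#!/bin/sh
# Hypothetical probe: succeed if the usage text on stdin mentions -batch.
supports_batch() {
    grep -q -e '-batch'
}

# Hypothetical selection of the startup flag from the probe result.
startup_flag() {
    if supports_batch; then
        echo "-batch"
    else
        echo "-r"
    fi
}

# Intended use (assumption: "matlab -help" prints usage and exits):
#   flag=$(matlab -help 2>&1 | startup_flag)
```

This is a feature probe rather than a version test, so it does not address the path issue with -batch discussed above.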

All in all, maybe we'll have to use stdin and, if "matlab" is a script, the local sysadmin will have to make sure it handles stdin.

Ok, I'll merge.

Thanks Guillaume! Thanks Duncan!

Cheers,

Anderson

gllmflndn commented 3 years ago

Thanks @scratchings - I looked up when I had those issues: it was with R2016a (standard input redirection was truncated). The answer from MathWorks support was:

The stdin redirection feature is not documented and also not supported. So it can change without notice. Unfortunately we cannot provide support for undocumented features.

Things might have changed in the meantime!