gvegayon / parallel

PARALLEL: Stata module for parallel computing
https://rawgit.com/gvegayon/parallel/master/ado/parallel.html
MIT License

Working Directory needs to be in S_ADO if using saved Mata functions #31

Closed bquistorff closed 8 years ago

bquistorff commented 8 years ago

When saving Mata functions, parallel creates an mlib file in the current working directory. If "." is not part of the ado-path, the child processes won't find this file. See this example program:

clear all

//get parallel
sysdir set PERSONAL "code/ado" //change this for your own system
global S_ADO "BASE;PERSONAL;."
mata: mata mlib index
parallel setclusters 2

//define the dummy task
mata:
void dummy_func(){
    printf("Hi\n")
}
end
program dummy_prog
    mata: dummy_func()
end

parallel, keep mata programs(dummy_prog): dummy_prog
_assert r(pll_errs)==0 //works

global S_ADO "BASE;PERSONAL"
parallel, keep mata programs(dummy_prog): dummy_prog
_assert r(pll_errs)==2 //doesn't work

The user can work around the error by adding "." to the ado-path (either via S_ADO or with an include() file) or by moving the Mata code to an mlib.

I can't think of a perfect solution on the package side. Any change to the ado-path might cause the programs to work differently (they might have ados in "." that override other programs), or the user might change S_ADO in the child process, negating the fix. I suggest that we issue a warning and let the user deal with it. This situation is likely not very common, so it would be little burden to users. We can check for a likely error by doing cap which l__pll`parallelid'_mlib.mlib and checking _rc (the user could work around the issue with an include() or by changing S_ADO in their child process, so it's not necessarily an error).
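A rough sketch of what such a warning could look like, not the package's actual code. It assumes the saved library is named l__pll`parallelid'_mlib.mlib and that a local macro parallelid (hypothetical here) holds the id of the current run. It uses findfile, which searches the ado-path, in place of the which call suggested above:

```stata
// Hypothetical: parallelid holds the id parallel assigned to this run.
local mlibfile l__pll`parallelid'_mlib.mlib

// findfile searches the ado-path and sets a nonzero _rc if nothing is found.
capture findfile `mlibfile'
if _rc {
    display as error "warning: `mlibfile' was not found on the ado-path;"
    display as error `"child processes may not see saved Mata functions unless "." is in S_ADO"'
}
```

The check only warns and does not abort, since the child process may still locate the library by other means.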

gvegayon commented 8 years ago

What about adding the dir option here:

https://github.com/gvegayon/parallel/blob/1b7a7f493f520a389516f35282d805b134ab62bd/ado/parallel.ado#L287-L288

We can explicitly save it in PERSONAL and then remove it at the end. The other option is to modify the S_ADO global by adding "." and, if it wasn't there before, removing it afterwards.

What do you think?

bquistorff commented 8 years ago

I don't really want to put files in non-obvious places. If parallel has errors, then temporary files could accumulate. At least if this happens in "." it is obvious to the user.

The best option is probably to change S_ADO to pick up the new mlib (your second suggestion). I'll do that.
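To illustrate the idea (this is a user-side sketch, not the change made in the package), one could remember the ado-path, append "." only if it is missing, and restore the original value afterwards. It reuses dummy_prog from the example in the first comment; the local names old_ado and rc are made up for the illustration:

```stata
// Remember the caller's ado-path so it can be restored later.
local old_ado `"${S_ADO}"'

// Append "." only if it is not already listed as its own entry.
if strpos(`";${S_ADO};"', ";.;") == 0 {
    global S_ADO `"${S_ADO};."'
}

// Run the parallel call, capturing any error so the restore always happens.
capture noisily parallel, keep mata programs(dummy_prog): dummy_prog
local rc = _rc

// Put the original ado-path back, then re-raise the error if there was one.
global S_ADO `"`old_ado'"'
if `rc' error `rc'
```

Restoring the saved value, rather than stripping "." afterwards, leaves an ado-path that already contained "." untouched.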

bquistorff commented 8 years ago

Implemented in 1256419.