ojwoodford / batch_job

Parallelize MATLAB for loops across workers, without the Parallel Computing Toolbox
MIT License

batch_job not on the path, no error from workers #4

Closed: jasonnicholson closed this issue 4 years ago

jasonnicholson commented 6 years ago

If you launch batch_job without having permanently added it to the MATLAB path, the workers throw an error, but the exception is silently swallowed. The user gets no feedback that anything went wrong. The failure should instead be reported in a way that makes the cause clear to the user.

The problem came up on Linux. User permissions prevent me from permanently adding batch_job to the path, so I have to add the batch_job location to the path in a startup.m file instead. This is a manual step, not an automated one, so I don't always do it, which is how I stumbled on this issue.

I tracked the error down. The offending line of code in batch_job.m is this:

[status, cmdout] = system(sprintf('%s -automation -nodisplay -r "try, batch_job(''%s'', %d); catch, end; quit();" &', executable, params_file, worker));

Specifically,

try, batch_job(''%s'', %d); catch, end; quit();

Is there something we can put into the catch block that would alert the user to a problem? Otherwise, maybe we just list this as a known issue.
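One possibility, as a minimal sketch rather than a tested patch: have the catch block write the error report to a per-worker log file instead of discarding it. The `log_file` name and location, and the stand-in values for `executable`, `params_file` and `worker`, are assumptions made up for illustration; only the launch command itself comes from batch_job.m.

```matlab
% Hypothetical stand-ins for variables that batch_job.m already has in scope.
executable = 'matlab';
params_file = '/tmp/params.mat';
worker = 1;

% Assumed log location: one text file per worker, next to the params file.
log_file = sprintf('%s.worker%d.log', params_file, worker);

% Code the worker runs: the same try/catch as before, except that the catch
% block writes the error report to log_file instead of discarding it.
worker_code = sprintf(['try, batch_job(''%s'', %d); ', ...
    'catch me, fid = fopen(''%s'', ''w''); ', ...
    'fprintf(fid, ''%%s'', getReport(me, ''extended'', ''hyperlinks'', ''off'')); ', ...
    'fclose(fid); end; quit();'], params_file, worker, log_file);

% Same launch command as in batch_job.m, with the new worker code substituted.
cmd = sprintf('%s -automation -nodisplay -r "%s" &', executable, worker_code);
disp(cmd);  % inspect the command; pass it to system(cmd) to launch the worker
```

With this in place, a worker that cannot find batch_job would leave an "Undefined function" report in its log file, which the launching session could look for. Note that if the log file itself cannot be opened, the catch block will error before reaching quit(), so a more defensive version would guard against that.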

ojwoodford commented 6 years ago

Good point.

I think what's needed is a test script that makes sure everything is properly configured. I could potentially add a check to the start of batch_job_test().

spotlightgit commented 4 years ago

Well, such a test script would be nice. It should test not only the files of this package (batch_job...) but also the function being evaluated (the goal function). This should reduce debugging time :-)

ojwoodford commented 4 years ago

The functions are tested. There is a function batch_job_test() which tests the functionality.

But maybe the error checking can be improved on the workers.

ojwoodford commented 4 years ago

Having thought about this, I stuck with my original idea of adding a check to batch_job_test(). This script should be run before using batch_job at all. I'll add a comment to the README.
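As a rough sketch of what such a check could look like (this is not the code that was actually added to batch_job_test(), and the warning identifier and message are made up for illustration): start a fresh MATLAB session, let it report through its exit status whether batch_job is on its path, and warn if it is not. It assumes Linux or macOS, where the `matlab` launcher blocks until the session exits, and assumes that a session started this way sees the same path as a worker would.

```matlab
% Assumption: the launcher under matlabroot starts sessions like the workers'.
executable = fullfile(matlabroot, 'bin', 'matlab');

% exist(..., 'file') returns 2 for an .m file on the path.
% The fresh session exits with status 0 if batch_job is found, 1 if not.
check_code = 'exit(double(exist(''batch_job'', ''file'') ~= 2));';
cmd = sprintf('"%s" -nodisplay -nosplash -r "%s"', executable, check_code);

[status, ~] = system(cmd);  % blocks until the fresh session exits
if status ~= 0
    warning('batch_job:notOnPath', ...
        ['A fresh MATLAB session did not find batch_job, or failed to start ', ...
         '(exit status %d). Add the batch_job folder to the path in ', ...
         'startup.m or with savepath.'], status);
else
    disp('batch_job is visible to a fresh MATLAB session.');
end
```

Checking in a fresh session matters here: calling exist('batch_job', 'file') in the launching session would pass even when the folder was only added with addpath for that session, which is exactly the case in which the workers fail.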