Open bubulv opened 8 years ago
Did you adapt jobParallel to fit your compute environment?
Thanks for answering. Sorry, but I don't know how to adapt jobParallel to fit my cluster. jobParallel is a MATLAB file containing a function, and I don't know how to call it myself. Do I need to change the matlabString in the simplePBS.m file to my own matlabdir?
In the simplePBS.m file I found that I can't get the jobId. I think I can't execute the trainTree.sh file. I don't understand the trainTree.sh file; it's different from my own .sh files. Do I need to change something in the file to fit my cluster? Thank you very much if you can answer these questions.
肖旭
On February 25, 2016, at 04:48, Saurabh Gupta notifications@github.com wrote:
Did you adapt jobParallel to fit your compute environment?
I also have this problem and don't know how to adapt jobParallel to fit my compute environment, as @s-gupta has suggested.
I tried editing script_edges.m in the "Testing code" section like this:
jobParam = struct('numThreads', 2, 'codeDir', pwd(), 'preamble', '', 'matlabpoolN', 1, 'globalVars', {{}}, 'fHandle', @empty, 'numOutputs', 1);
resourceParam = struct('mem', 3, 'hh', 1, 'numJobs', 50, 'ppn', 2, 'nodes', 3, 'logDir', '/home/hojat/Documents/application/rcnn-depth/results/log/pbsBatchDir/', 'queue', 'psi', 'notif', false, 'username', 'hojat', 'headNode', 'psi');
Is this incorrect? Any help is greatly appreciated.
jobParallel sends the parameters to simplePBS, which should run the shell script using the system function.
The script runs without error, but it actually does nothing!
Also, at the beginning of the run I get this:
Running ssh psi 'qsub /home/hojat/Documents/application/rcnn-depth/results/log/pbsBatchDir/2017-04-28-15:07:29-edgesEvalImg/edgesEvalImg.sh -t 1-50'
ssh psi 'qsub /home/hojat/Documents/application/rcnn-depth/results/log/pbsBatchDir/2017-04-28-15:07:29-edgesEvalImg/edgesEvalImg.sh -t 1-50' : Signal 127
ssh: /usr/local/MATLAB/R2016b/bin/glnxa64/libcrypto.so.1.0.0: no version information available (required by ssh)
ssh: /usr/local/MATLAB/R2016b/bin/glnxa64/libcrypto.so.1.0.0: no version information available (required by ssh)
OpenSSL version mismatch. Built against 1000207f, you have 100010bf
I don't know whether this is the cause, and I don't know of any way to verify that the script is running as it should. I even tried running jobName = 'test_edge_model'; script_edges;
and it ran for more than 36 hours(!) with no apparent result.
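One way to test whether the library mismatch is what blocks the submission: the log above shows ssh loading libcrypto from the MATLAB install directory, which suggests the command inherits MATLAB's LD_LIBRARY_PATH. The sketch below runs a command with that variable removed. The helper name run_without_matlab_libs is hypothetical and not part of the repository, and the job script path in the example is a placeholder.

```shell
#!/bin/sh
# Run a command with LD_LIBRARY_PATH removed, so that system binaries such as
# ssh load the system OpenSSL instead of a copy found via MATLAB's library path.
# run_without_matlab_libs is a hypothetical helper name, not part of the repo.
run_without_matlab_libs() {
    (
        # The subshell keeps the caller's environment untouched.
        unset LD_LIBRARY_PATH
        "$@"
    )
}

# Example (assumes the head node "psi" and a placeholder job script path):
#   run_without_matlab_libs ssh psi 'qsub /path/to/job.sh -t 1-50'
#   echo "exit status: $?"   # in a shell, 127 conventionally means "command not found"
```

If the submission works this way from a terminal but not from MATLAB, one option may be to prefix the command string that is passed to system with `LD_LIBRARY_PATH= ` so the same clean environment is used there.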
I have the same problem. Have you solved it? If you have, please tell me the answer. Help is greatly appreciated! @hojat-kaveh @s-gupta @bubulv
When I run 'jobName = 'test_edge_model'; script_edges;' in MATLAB, the output looks like this:
***(000 / 000 / 100) *_*****(000 / 000 / 100) ********_(000 / 000 / 100) ****(000 / 000 / 100) *_*****(000 / 000 / 100) ********_(000 / 000 / 100) ****(000 / 000 / 100) *_**(000 / 000 / 100)
I found that the error is in the 'collectJob.m' file: it always falls out of the 'try catch' block and never stops. It can't run load(fullfile(jobDir, sprintf('output-%03d.mat', i)), 'jobDone', 'jobMine', 'jobError') because there is no 'output-x.mat' in the jobDir.
The same problem occurs when I run 'jobName = 'edges_to_ucms'; script_regions;' in MATLAB. Could you tell me why?
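To confirm from a terminal which jobs never wrote their result, a small check like the one below may help. It is only a sketch: it assumes the output files follow the 'output-%03d.mat' pattern from the load call quoted above and that job indices run from 1 to the number of jobs. The function name missing_outputs is hypothetical.

```shell
#!/bin/sh
# Print the index of every job whose output file is absent from the job directory.
# Assumes files are named output-001.mat, output-002.mat, ... as in collectJob.m.
# missing_outputs is a hypothetical helper name, not part of the repo.
missing_outputs() {
    job_dir=$1
    num_jobs=$2
    i=1
    while [ "$i" -le "$num_jobs" ]; do
        f=$(printf '%s/output-%03d.mat' "$job_dir" "$i")
        # Report the index only when the expected file does not exist.
        [ -e "$f" ] || printf '%d\n' "$i"
        i=$((i + 1))
    done
}

# Example (placeholder directory, 50 jobs as in the resourceParam above):
#   missing_outputs /path/to/pbsBatchDir/some-job-dir 50
```

If every index is printed, the jobs most likely never started on the cluster, which would point at the submission step rather than at collectJob.m itself.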