Closed tamasgal closed 4 years ago
If your SGE cluster supports qrsh as well as qsub, you might try the undocumented QRSHManager instead.
addprocs_qrsh(...
On high-performance file systems that aggressively buffer file I/O, interprocess communication via TCP is much more reliable.
Thanks, I tried but no success:
julia> addprocs_qrsh(5)
got no response from JSV script "/opt/sge/util/resources/jsv/corebinding.jsv"
got no response from JSV script "/opt/sge/util/resources/jsv/corebinding.jsv"
got no response from JSV script "/opt/sge/util/resources/jsv/corebinding.jsv"
got no response from JSV script "/opt/sge/util/resources/jsv/corebinding.jsv"
got no response from JSV script "/opt/sge/util/resources/jsv/corebinding.jsv"
Too old to reproduce. We've released a new version of the package, please report any issues there if they still apply.
Thanks, I'll try!
I hope someone can help me. I am trying to do parallel computing with Julia on our SGE grid system, which I have so far only used with shell scripts.
When I run, for example,
addprocs_sge(5,res_list="ct=00:01:00")
it immediately shows the received job ID and waits for the job to start. A few seconds later, however, I receive error messages indicating that it can't tail the log files in my home directory, even though they are present:
This is the content of one of the log files. It apparently fails to run the julia process, although the binary is of course accessible (I am using the same one for the REPL):
Any ideas what's happening here?
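One way to narrow this down might be to record which binary and directories the REPL session actually uses, and then compare them with what a compute node sees. The snippet below is only a minimal sketch, assuming Julia 1.x. It uses Base functions only and makes no ClusterManagers calls, and the variable name julia_bin is purely illustrative.

```julia
# Diagnostic sketch: print what the current session uses, so the values can be
# compared with the same output produced on a compute node.

# Full path of the julia executable running this session
# (assumption: Julia 1.x, where Sys.BINDIR is available).
julia_bin = joinpath(Sys.BINDIR, Base.julia_exename())

println("julia binary : ", julia_bin)
println("exists       : ", isfile(julia_bin))
println("home dir     : ", homedir())
println("working dir  : ", pwd())
println("hostname     : ", gethostname())
```

Running the same lines from a plain shell-script job on the grid and comparing the output with the login node could show whether the binary path or the home directory differs between the two.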