Open soichih opened 4 years ago
Another test failed.
*******************************Held jobs' details*******************************
==========================autorecon1_sh_subject_00001===========================
submit file : autorecon1_sh_subject_00001.sub
last_job_instance_id : 9
reason : Error from slot1_2@condor-worker-7c7d97844f-m2hrq@river-c048.ssl-hep.org: STARTER at 192.168.8.30 failed to send file(s) to <192.170.227.166:9618>: error reading from /var/lib/condor/execute/dir_3291/subject_recon1_output.tar.xz: (errno 2) No such file or directory; SHADOW failed to receive file(s) from <192.170.236.148:34378>
******************************Failed jobs' details******************************
==========================autorecon1_sh_subject_00001===========================
last state: POST_SCRIPT_FAILED
site: condorpool
submit file: 00/00/autorecon1_sh_subject_00001.sub
output file: 00/00/autorecon1_sh_subject_00001.out.002
error file: 00/00/autorecon1_sh_subject_00001.err.002
-------------------------------Task #1 - Summary--------------------------------
site : condorpool
hostname : -
executable : /public/hayashis/workdir/5ea9b8ff4623dab009eddf97/5eab20644623da3b9fee55eb/work/00/00/autorecon1_sh_subject_00001.sh
arguments : -
exitcode : -1
working dir : /public/hayashis/workdir/5ea9b8ff4623dab009eddf97/5eab20644623da3b9fee55eb/work
Here is another instance of this error.
*******************************Held jobs' details*******************************
===========================autorecon1_sh_output_00001===========================
submit file : autorecon1_sh_output_00001.sub
last_job_instance_id : 5
reason : Error from slot1_5@glidein_19785_659649776@lnxfarm338.colorado.edu: STARTER at 192.168.4.138 failed to send file(s) to <192.170.227.166:9618>; SHADOW at 192.170.227.166 failed to write to file /public/hayashis/scratch/work/output_recon1_output.tar.xz: (errno 2) No such file or directory
******************************Failed jobs' details******************************
=========================autorecon2_sh_output-rh_00002==========================
last state: POST_SCRIPT_FAILED
site: condorpool
submit file: 00/00/autorecon2_sh_output-rh_00002.sub
output file: 00/00/autorecon2_sh_output-rh_00002.out
error file: 00/00/autorecon2_sh_output-rh_00002.err
-------------------------------Task #1 - Summary--------------------------------
site : condorpool
hostname : -
executable : /public/hayashis/workdir/5ea9b8ff4623dab009eddf97/5eadfd754623da2032ef1e58/work/00/00/autorecon2_sh_output-rh_00002.sh
arguments : -
exitcode : -1
working dir : /public/hayashis/workdir/5ea9b8ff4623dab009eddf97/5eadfd754623da2032ef1e58/work
-----------Job stderr file - 00/00/autorecon2_sh_output-rh_00002.err------------
Job submission failed because of HTCondor event SUBMIT_FAILED
=========================autorecon2_sh_output-lh_00003==========================
last state: POST_SCRIPT_FAILED
site: condorpool
submit file: 00/00/autorecon2_sh_output-lh_00003.sub
output file: 00/00/autorecon2_sh_output-lh_00003.out
error file: 00/00/autorecon2_sh_output-lh_00003.err
-------------------------------Task #1 - Summary--------------------------------
site : condorpool
hostname : -
executable : /public/hayashis/workdir/5ea9b8ff4623dab009eddf97/5eadfd754623da2032ef1e58/work/00/00/autorecon2_sh_output-lh_00003.sh
arguments : -
exitcode : -1
working dir : /public/hayashis/workdir/5ea9b8ff4623dab009eddf97/5eadfd754623da2032ef1e58/work
-----------Job stderr file - 00/00/autorecon2_sh_output-lh_00003.err------------
Job submission failed because of HTCondor event SUBMIT_FAILED
I was able to run the test job successfully and obtained what appears to be valid FreeSurfer output.
However, when I ran another test job with the same T1 input, it failed with this error message.
How should I handle this error?
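To narrow down which side is missing the file, here is a minimal sketch that pulls the errno, the path, and the reporting daemon out of the `reason` lines pasted in this thread. The regular expression and the `parse_hold_reason` helper are my own assumptions, based only on the two hold reasons shown here; they are not part of Pegasus or HTCondor, and other hold reasons may be worded differently.

```python
import errno
import re

# Hypothetical pattern, derived only from the two hold reasons in this thread.
# It looks for the daemon (STARTER or SHADOW) whose clause carries the
# "(errno N)" detail; clauses are separated by ';' in the reason string.
HOLD_RE = re.compile(
    r"(?P<side>STARTER|SHADOW)[^;]*?"
    r"(?:error reading from|failed to write to file)\s+"
    r"(?P<path>\S+?):\s+\(errno (?P<errno>\d+)\)\s+(?P<msg>[^;]+)"
)


def parse_hold_reason(reason):
    """Return a dict describing the failing file, or None if nothing matches."""
    match = HOLD_RE.search(reason)
    if match is None:
        return None
    number = int(match.group("errno"))
    return {
        "side": match.group("side"),      # daemon that reported the errno
        "path": match.group("path"),      # file it could not read or write
        "errno": number,
        "name": errno.errorcode.get(number, "UNKNOWN"),
        "message": match.group("msg").strip(),
    }


if __name__ == "__main__":
    reason = (
        "Error from slot1_2@condor-worker-7c7d97844f-m2hrq@river-c048.ssl-hep.org: "
        "STARTER at 192.168.8.30 failed to send file(s) to <192.170.227.166:9618>: "
        "error reading from /var/lib/condor/execute/dir_3291/subject_recon1_output.tar.xz: "
        "(errno 2) No such file or directory; "
        "SHADOW failed to receive file(s) from <192.170.236.148:34378>"
    )
    print(parse_hold_reason(reason))
```

If I read the two reasons correctly, the first case has `side` equal to `STARTER`, meaning the tarball was not found in the execute directory on the worker, while the second has `side` equal to `SHADOW`, meaning the destination path on the submit host could not be written. That is only my interpretation of the log text, so please correct me if it is wrong.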