dmwm / CRAB2

catch and handle gsissh stale socket #897

Closed ericvaandering closed 10 years ago

ericvaandering commented 10 years ago

Original Savannah ticket 101073 reported by belforte on Tue Apr 2 08:08:28 2013.

e.g.

crab: Execute command : gsissh -o ControlMaster=auto -o ControlPath=/tmp/meridian/.ssh/ssh-link-1235429647-submit-4.t2.ucsd.edu -o BatchMode=yes -o StrictHostKeyChecking=no -o ForwardX11=no submit-4.t2.ucsd.edu mkdir -p meridian_Photon_Run2012A-13Jul2012-v1_c93m1k
crab: Status,output= 65280,Control socket connect(/tmp/meridian/.ssh/ssh-link-1235429647-submit-4.t2.ucsd.edu): Connection refused
ControlSocket /tmp/meridian/.ssh/ssh-link-1235429647-submit-4.t2.ucsd.edu already exists
crab: Command: gsissh -o ControlMaster=auto -o ControlPath=/tmp/meridian/.ssh/ssh-link-1235429647-submit-4.t2.ucsd.edu -o BatchMode=yes -o StrictHostKeyChecking=no -o ForwardX11=no submit-4.t2.ucsd.edu mkdir -p meridian_Photon_Run2012A-13Jul2012-v1_c93m1k failed with output=
Control socket connect(/tmp/meridian/.ssh/ssh-link-1235429647-submit-4.t2.ucsd.edu): Connection refused
ControlSocket /tmp/meridian/.ssh/ssh-link-1235429647-submit-4.t2.ucsd.edu already exists

ericvaandering commented 10 years ago

Comment by belforte on Wed Apr 24 07:28:50 2013

CRAB will now detect "already exists" in the gsis* output when the command fails, and remove the control socket.

/local/reps/CMSSW/COMP/PRODCOMMON/src/python/ProdCommon/BossLite/Scheduler/SchedulerRemoteglidein.py,v <-- SchedulerRemoteglidein.py
new revision: 1.26; previous revision: 1.25

This will require a new ProdCommon tag: PRODCOMMON_0_12_18_CRAB_54

ericvaandering commented 10 years ago

Comment by belforte on Wed Apr 24 07:39:56 2013

The check that the gsis* command had really failed was missing in some cases; fixed.

SchedulerRemoteglidein.py new revision: 1.27; previous revision: 1.26
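
For readers without access to the CVS revisions, the behaviour described in the two comments above can be sketched roughly as follows. This is not the code committed to SchedulerRemoteglidein.py; it is a minimal illustration, and the names `STALE_SOCKET_MARKER`, `is_stale_socket_failure` and `remove_stale_socket` are hypothetical. It assumes the caller already has the exit status and combined output of the gsis* command, as in the "Status,output=" log line of the original report.

```python
import os

# Text that ssh prints when the ControlPath file is left over from a dead master
# (taken from the log in the original report).
STALE_SOCKET_MARKER = "already exists"


def is_stale_socket_failure(status, output):
    """Return True only if the command really failed AND reported an existing socket.

    Checking the status first mirrors the follow-up fix: the marker alone is
    not treated as an error when the command succeeded.
    """
    return status != 0 and STALE_SOCKET_MARKER in output


def remove_stale_socket(status, output, control_path):
    """Remove the control socket file if the failure looks like a stale socket.

    Returns True when a stale-socket failure was detected (so the caller may
    retry the command), False otherwise.
    """
    if not is_stale_socket_failure(status, output):
        return False
    try:
        os.remove(control_path)
    except FileNotFoundError:
        # Socket already gone; nothing left to clean up.
        pass
    return True
```

A caller would typically rerun the failed gsis* command once after `remove_stale_socket` returns True, so that ControlMaster=auto can create a fresh master connection.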

ericvaandering commented 10 years ago

Comment by belforte on Fri May 3 14:06:45 2013

Released in client CRAB_2_8_7.

ericvaandering commented 10 years ago

Closed by belforte on Fri May 3 14:06:45 2013