Glast / GlastDIRAC

DIRAC extension for GLAST/Fermi-LAT

gridAccess::stageSet.finish() returns zero RC #111

Closed. zimmerst closed this issue 10 years ago.

zimmerst commented 10 years ago

Apparently the return code of gridAccess::stageSet.finish() is zero even though failures are reported by the underlying API code:

%INFO: 20140507:12:11:28 - runMonteCarlo()/line-640 - Calling stageSet.finish()...
SRM2Storage.exists: Failed to get path metadata. srm://lptadpmsv.msfg.fr:8446/srm/managerv2?SFN=/dpm/msfg.fr/home/glast.org/user/z/zimmer/ServiceChallenge/allProton-GR-v20r09p06-OVL6p2/10092: [SE][Ls][SRM_FAILURE] send2nsd: NS009 - fatal configuration error: Host unknown: lptadpmsv.msfg.fr
createDirectory: Failed to create directory on storage. srm://lptadpmsv.msfg.fr:8446/srm/managerv2?SFN=/dpm/msfg.fr/home/glast.org/user/z/zimmer/ServiceChallenge/allProton-GR-v20r09p06-OVL6p2/10092: SRM2Storage.exists: Failed to get path metadata. [SE][Ls][SRM_FAILURE] send2nsd: NS009 - fatal configuration error: Host unknown: lptadpmsv.msfg.fr
putAndRegister: Failed to put file to Storage Element. /var/torque/tmpdir/5541.lptace01.msfg.fr/CREAM774033653/DIRAC_5KfExwpilot/10281291/allProton-GR-v20r09p06-OVL6p2-010092-merit.root: SRM2Storage.exists: Failed to get path metadata. [SE][Ls][SRM_FAILURE] send2nsd: NS009 - fatal configuration error: Host unknown: lptadpmsv.msfg.fr
__putFile: Failed to put file to storage. srm://lptadpmsv.msfg.fr:8446/srm/managerv2?SFN=/dpm/msfg.fr/home/glast.org/user/z/zimmer/ServiceChallenge/allProton-GR-v20r09p06-OVL6p2/10092/dirac_directory.1399457555.07: Invalid argument

removeFile: Failed to remove file. srm://lptadpmsv.msfg.fr:8446/srm/managerv2?SFN=/dpm/msfg.fr/home/glast.org/user/z/zimmer/ServiceChallenge/allProton-GR-v20r09p06-OVL6p2/10092/dirac_directory.1399457555.07: [SE][srmRm][SRM_FAILURE]
removeFile: Failed to remove file. srm://lptadpmsv.msfg.fr:8446/srm/managerv2?SFN=/dpm/msfg.fr/home/glast.org/user/z/zimmer/ServiceChallenge/allProton-GR-v20r09p06-OVL6p2/10092/dirac_directory.1399457555.07: [SE][srmRm][SRM_AUTHORIZATION_FAILURE]
createDirectory: Failed to create directory on storage. srm://lptadpmsv.msfg.fr:8446/srm/managerv2?SFN=/dpm/msfg.fr/home/glast.org/user/z/zimmer/ServiceChallenge/allProton-GR-v20r09p06-OVL6p2/10092: __putFile: Failed to put file to storage.
putAndRegister: Failed to put file to Storage Element. /var/torque/tmpdir/5541.lptace01.msfg.fr/CREAM774033653/DIRAC_5KfExwpilot/10281291/allProton-GR-v20r09p06-OVL6p2-010092-merit.root: __putFile: Failed to put file to storage.
total 996
-rw-r--r-- 1 glarvu060 glast.org 251506 May  7 12:14 DIRAC_wrapper.txt
-rw-r--r-- 1 glarvu060 glast.org    362 May  7 12:11 MeritTuple_cache.root
drwxr-xr-x 2 glarvu060 glast.org   4096 May  7 11:44 Pass8_Analysis_Open.d
-rw-r--r-- 1 glarvu060 glast.org 264689 May  7 12:11 allProton-GR-v20r09p06-OVL6p2-010092-merit.root
-rw-r--r-- 1 glarvu060 glast.org    149 May  7 12:11 checksum.txt
drwxrwxr-x 4 glarvu060 glast.org   4096 May  7 11:44 config
-rw-rw-r-- 1 glarvu060 glast.org   2227 Apr 29 14:02 config.py
-rw-r--r-- 1 glarvu060 glast.org    841 May  7 11:44 config.pyc
-rw-rw-r-- 1 glarvu060 glast.org  31419 May  7 11:38 config.tar.gz
-rwxr-xr-x 1 glarvu060 glast.org   3130 May  7 11:44 dirac-glast-pipeline-wrapper.sh
-rw-r--r-- 1 glarvu060 glast.org     21 May  7 12:11 eventId.txt
drwxrwxr-x 2 glarvu060 glast.org   4096 May  7 11:44 irfs
-rw-rw-r-- 1 glarvu060 glast.org   9345 May  7 11:38 irfs.tar.gz
-rw-r--r-- 1 glarvu060 glast.org   8201 May  7 11:44 job.info
-rwxrwxr-x 1 glarvu060 glast.org  11444 May  7 11:38 jobDescription.xml
-rw-rw-r-- 1 glarvu060 glast.org    145 May  7 11:44 jobmeta.inf
-rw-r--r-- 1 glarvu060 glast.org 251158 May  7 12:14 logFile.txt
-rwxrwxr-x 1 glarvu060 glast.org   5204 Apr 29 13:27 mcRegistration.py
-rw-rw---- 1 glarvu060 glast.org   4254 Apr 29 13:27 merit.jo
-rw-r--r-- 1 glarvu060 glast.org     48 May  7 11:44 outFiles.list
-rw-r--r-- 1 glarvu060 glast.org  14076 May  7 11:44 pipeline_env
-rw-r--r-- 1 glarvu060 glast.org    200 May  7 11:44 pipeline_summary
-rw-rw---- 1 glarvu060 glast.org   4256 Apr 29 13:27 recon_relation_mc_digi_merit.jo
-rwxrwxr-x 1 glarvu060 glast.org  26010 Apr 29 13:27 runMonteCarlo.py
-rwxrwxr-x 1 glarvu060 glast.org    567 Apr 29 13:27 runSvac.sh
-rwxrwxr-x 1 glarvu060 glast.org   1449 Apr 29 13:27 runWrapper.sh
-rwxr-xr-x 1 glarvu060 glast.org    359 May  7 11:38 script
-rw-r--r-- 1 glarvu060 glast.org    101 May  7 12:11 source_info.txt
-rw-rw-r-- 1 glarvu060 glast.org   7231 Apr 30 12:40 taskConfig.xml
-rwxrwxr-x 1 glarvu060 glast.org   8274 Apr 29 13:27 transfer2SLAC.py
drwxrwxr-x 2 glarvu060 glast.org   4096 May  7 11:44 xml
-rw-rw-r-- 1 glarvu060 glast.org   2638 May  7 11:38 xml.tar.gz
%INFO: 20140507:12:15:26 - runMonteCarlo()/line-642 - Return from stageSet.finish = 0
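
For context: DIRAC's data-management clients (e.g. the putAndRegister call that produced the messages above) report failures through S_OK/S_ERROR result dictionaries rather than exceptions. The sketch below is a hypothetical illustration of the failure mode, not the actual gridAccess source: a finish() that never inspects result['OK'] falls through to returning 0 even when every upload failed, and the fix is simply to propagate the first error into the return code. The class and attribute names are stand-ins.

```python
class StageSet(object):
    """Hypothetical stand-in for the gridAccess stageSet object."""

    def __init__(self, replicaManager, stagedFiles, seName):
        self.replicaManager = replicaManager  # e.g. a DIRAC ReplicaManager
        self.stagedFiles = stagedFiles        # list of (localPath, lfn) pairs
        self.seName = seName                  # destination storage element

    def finish(self):
        # Buggy variant: each upload's S_OK/S_ERROR result is thrown away,
        # so the method reports success no matter what happened above.
        for localPath, lfn in self.stagedFiles:
            self.replicaManager.putAndRegister(lfn, localPath, self.seName)
        return 0

    def finishChecked(self):
        # Fixed variant: propagate the first failure as a non-zero RC.
        for localPath, lfn in self.stagedFiles:
            result = self.replicaManager.putAndRegister(lfn, localPath, self.seName)
            if not result['OK']:
                print('upload of %s failed: %s' % (lfn, result['Message']))
                return 1
        return 0
```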

sposs commented 10 years ago

A quick look at the code makes me wonder: what about using FailoverTransfer? It uses Requests to delegate the copy of the file to a later point in case the destination SE is not available. In the meantime, the file is stored on a temporary SE, which is cleaned up once the file has reached its final destination. There isn't much to do on the client side; just use the FailoverTransfer client (located in DIRAC/DataManagementSystem/Client/), as sketched below.
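
A minimal sketch of what that could look like, assuming the 2014-era client API: the transferAndRegisterFile / transferAndRegisterFileFailover method names and argument lists below should be checked against the deployed DIRAC release, and the SE names, local path, and LFN are hypothetical placeholders, not values from this job.

```python
from DIRAC.DataManagementSystem.Client.FailoverTransfer import FailoverTransfer

failover = FailoverTransfer()

fileName = 'allProton-GR-v20r09p06-OVL6p2-010092-merit.root'
localPath = '/path/to/job/output/' + fileName  # hypothetical local path
lfn = ('/glast.org/user/z/zimmer/ServiceChallenge/'
       'allProton-GR-v20r09p06-OVL6p2/10092/' + fileName)

# First attempt: upload and register straight to the final destination SE.
result = failover.transferAndRegisterFile(fileName, localPath, lfn,
                                          destinationSEList=['DESTINATION-SE'])
if not result['OK']:
    # Destination unreachable (e.g. the DPM errors above): park the file on a
    # temporary failover SE and record a request so that DIRAC replicates it
    # to the final SE later and removes the temporary copy afterwards.
    result = failover.transferAndRegisterFileFailover(fileName, localPath, lfn,
                                                      targetSE='DESTINATION-SE',
                                                      failoverSEList=['FAILOVER-SE'])
if not result['OK']:
    # Both attempts failed: this is where a non-zero RC should be returned.
    print('failover upload failed: %s' % result['Message'])
```

The failover variant records a replication-and-removal request, so the file first lands on the temporary SE and the RequestManagement machinery later moves it to the final SE and cleans up the temporary copy, which is the behaviour described above; how the accumulated request would be committed by the GlastDIRAC wrapper is left out here.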

zimmerst commented 10 years ago

Excellent suggestion. The main reason the code is the way it is is that Vincent wrote it a long time ago with Luisa's help - and of course you know the DIRAC tools much better - so the simple answer is that I guess we didn't know about the FailoverTransfer client. When I delegated the development, my target was simple: I wanted to duplicate the interface of the existing pipeline 'staging' code (note that the term is used generically in our code, not just for tapes).

zimmerst commented 10 years ago

This issue appears to be fixed in v1r3p14.