dmwm / CRAB2

beware FJR's with no lumi but exit code=0 #1106

Closed: belforte closed this issue 10 years ago

belforte commented 10 years ago

It is possible :-( for cmsRun to end with exit code 0 while the FJR has no record of the runs and lumis read for the output file. This goes undetected until the crab publication code fails with a Python exception when trying to manipulate the lumi list:

    File "/afs/cern.ch/work/b/belforte/MY_CRAB2/python/crab_dbs3publish.py", line 36, in format_file_3
      for run, lumis in file['runlumi'].items():
    AttributeError: 'str' object has no attribute 'items'

See: https://hypernews.cern.ch/HyperNews/CMS/get/crabFeedback/7491.html

Need to catch this early and treat the job as failed.

belforte commented 10 years ago

All things considered, I will check for this in Publisher.py and avoid adding the FJR to the good_list, i.e. leave the file with exit code 0 but do not publish it. The alternative would be to catch it on the WN and force the job to have an error exit code, but I fear that may cause even more problems. In any case we do not check FJRs on the WN; all checks are done on the client side in CRAB2.
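
A minimal sketch of the kind of check described above, not the actual Publisher.py code. It assumes each parsed FJR file entry is a dict whose 'runlumi' key should map run numbers to lists of lumis (the traceback shows it can instead be a plain string). The helper names has_run_lumi and split_publishable, and the 'lfn' key, are hypothetical and used only for illustration.

```python
def has_run_lumi(file_info):
    """Return True only if 'runlumi' is a non-empty dict of run -> lumis."""
    runlumi = file_info.get('runlumi')
    # A missing key, an empty dict, or a non-dict value (e.g. a string)
    # all mean the FJR carries no usable run/lumi record.
    if not isinstance(runlumi, dict) or not runlumi:
        return False
    # Every run must list at least one lumi.
    return all(lumis for lumis in runlumi.values())


def split_publishable(files):
    """Separate file entries into publishable ones and ones to skip."""
    good_list, skipped = [], []
    for file_info in files:
        if has_run_lumi(file_info):
            good_list.append(file_info)
        else:
            skipped.append(file_info)
    return good_list, skipped


if __name__ == '__main__':
    # Hypothetical entries: one healthy, one with the string seen in the traceback.
    files = [
        {'lfn': 'a.root', 'runlumi': {'1': ['2', '3']}},
        {'lfn': 'b.root', 'runlumi': ''},
    ]
    good, bad = split_publishable(files)
    print('publish:', [f['lfn'] for f in good])
    print('skip:', [f['lfn'] for f in bad])
```

With a filter like this the job keeps its exit code 0, but a file without run and lumi information never reaches the code that calls file['runlumi'].items().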