Closed by ericvaandering 10 years ago
Comment by belforte on Tue Aug 20 10:00:22 2013
I saw some user MC tasks (Sherpa) where the exe was OK and stageout was OK, yet the exit code was 70500. I am wondering if it is the same issue, in which case it is better to fix it now.
Comment by belforte on Thu Aug 29 10:07:55 2013
The problem I found in the Sherpa MC tasks seems different: most likely the user did not configure script_exe correctly, e.g. did not produce an fjr file. The error is "Error: fjr file does not contain enough information". From stdout:
>>> Modify Job Report: CMSSW_VERSION = CMSSW_5_3_11 /pool/grid/cmsplt01/home_cream_216163927/CREAM216163927/glide_a13020/execute/dir_17986/ProdCommon/FwkJobRep/ModifyJobReport.py fjr /pool/grid/cmsplt01/home_cream_216163927/CREAM216163927/glide_a13020/execute/dir_17986/crab_fjr_3.xml json /pool/grid/cmsplt01/home_cream_216163927/CREAM216163927/glide_a13020/execute/dir_17986/resultCopyFile n_job 3_1_WSA PrimaryDataset null ApplicationFamily MCDataTier ApplicationName sh cmssw_version CMSSW_5_3_11 psethash ea119f275648ec12df6ff4345bde5eef inputDict = {u'/pool/grid/cmsplt01/home_cream_216163927/CREAM216163927/glide_a13020/execute/dir_17986/CMSSW_5_3_11/job_3_1_WSA.log': {u'endpoint': u'srm://srm-eoscms.cern.ch:8443/srm/v2/server?SFN=/eos/cms//store/group/phys_smp/smp-gen/nlops/sherpa/Z2jetNLO4jetLO_MPI_crab_seed2_2001/', u'surl_for_grid': u'', u'reason': u'Copy succedeed with srm-lcg utils', u'erCode': u'0', u'se_name': u'srm-eoscms.cern.ch', u'for_lfn': u'/store/group/phys_smp/smp-gen/nlops/sherpa/Z2jetNLO4jetLO_MPI_crab_seed2_2001/'}, u'/pool/grid/cmsplt01/home_cream_216163927/CREAM216163927/glide_a13020/execute/dir_17986/CMSSW_5_3_11/gentar_3_1_WSA.tgz': {u'endpoint': u'srm://srm-eoscms.cern.ch:8443/srm/v2/server?SFN=/eos/cms//store/group/phys_smp/smp-gen/nlops/sherpa/Z2jetNLO4jetLO_MPI_crab_seed2_2001/', u'surl_for_grid': u'', u'reason': u'Copy succedeed with srm-lcg utils', u'erCode': u'0', u'se_name': u'srm-eoscms.cern.ch', u'for_lfn': u'/store/group/phys_smp/smp-gen/nlops/sherpa/Z2jetNLO4jetLO_MPI_crab_seed2_2001/'}, u'/pool/grid/cmsplt01/home_cream_216163927/CREAM216163927/glide_a13020/execute/dir_17986/CMSSW_5_3_11/ntuple_3_1_WSA.root': {u'endpoint': u'srm://srm-eoscms.cern.ch:8443/srm/v2/server?SFN=/eos/cms//store/group/phys_smp/smp-gen/nlops/sherpa/Z2jetNLO4jetLO_MPI_crab_seed2_2001/', u'surl_for_grid': u'', u'reason': u'Copy succedeed with srm-lcg utils', u'erCode': u'0', u'se_name': u'srm-eoscms.cern.ch', u'for_lfn': 
u'/store/group/phys_smp/smp-gen/nlops/sherpa/Z2jetNLO4jetLO_MPI_crab_seed2_2001/'}} Error: fjr file does not contain enough information ModifyReportResult=70500 WARNING: Problem with ModifyJobReport
Comment by belforte on Mon Sep 9 07:46:36 2013
PSETHASH was set to null if not defined, but not when pset=none. I moved those lines out of the "if pset" block.
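As a rough illustration of the kind of change described here, the sketch below generates wrapper-script lines and emits the PSETHASH default outside the pset-only branch. This is not the actual cms_cmssw.py code: the function name wrapper_lines, the constant PSETHASH_DEFAULT and the exact shell line are all assumptions made for the example.

```python
# Hypothetical sketch only; the real cms_cmssw.py is structured differently.
PSETHASH_DEFAULT = 'if [ -z "$PSETHASH" ]; then export PSETHASH=null; fi'


def wrapper_lines(pset):
    """Return illustrative shell lines for the job wrapper script."""
    lines = []
    if pset is not None:
        # Placeholder for steps that only make sense when a pset is configured.
        lines.append("# pset-specific setup would go here")
    # Emitted unconditionally, so jobs with pset=none also get a value.
    lines.append(PSETHASH_DEFAULT)
    return lines


# With pset=none (represented here as None) the default is still emitted.
print("\n".join(wrapper_lines(None)))
```

The point of the sketch is only the placement: the default sits after the "if pset" branch rather than inside it.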
/local/reps/CMSSW/COMP/CRAB/python/cms_cmssw.py,v <-- cms_cmssw.py new revision: 1.399; previous revision: 1.398
fix released in CRAB_2_9_1
Original Savannah ticket 95570 reported by belforte on Thu Jun 21 13:03:17 2012.
Dear crab developers,
this problem is not urgent, I just wanted to bring it to your attention.
I have a rather unorthodox crab.cfg, which you can find here: /afs/cern.ch/user/j/joshmt/public/crab.cfg
In particular it has:
pset = none
Also, I am running an arbitrary script using the script_exe argument. I run cmsRun inside my job in order to generate an fjr xml file.
I found that jobs were exiting with code 70500 due to a problem with ModifyJobReport.py (see full error below *). This seems to be because the CMSSW.sh file generated by crab -create is running the following command:
$RUNTIME_AREA/ProdCommon/FwkJobRep/ModifyJobReport.py fjr $RUNTIME_AREA/crab_fjr_$NJob.xml json $RUNTIME_AREA/resultCopyFile n_job $OutUniqueID PrimaryDataset $PrimaryDataset ApplicationFamily $ApplicationFamily ApplicationName $executable cmssw_version $CMSSW_VERSION psethash $PSETHASH
but the variable PSETHASH has never been defined.
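To make the failure mode concrete, here is a minimal sketch assuming that ModifyJobReport.py pairs its arguments as alternating key/value words, as the line diz[L[i]] = L[i+1] in the traceback below suggests. The function parse_pairs and the shortened argument lists are hypothetical, written only for this illustration. When PSETHASH is undefined, the shell expands the unquoted variable to nothing, so the last key arrives without a value.

```python
def parse_pairs(args):
    """Pair up alternating key/value arguments into a dict."""
    pairs = {}
    for i in range(0, len(args), 2):
        # Raises IndexError when the final key has no value after it.
        pairs[args[i]] = args[i + 1]
    return pairs


# Arguments as received when PSETHASH is set, e.g. to "null".
good = ["cmssw_version", "CMSSW_5_2_5", "psethash", "null"]
# Arguments as received when PSETHASH is undefined: the value is simply absent.
bad = ["cmssw_version", "CMSSW_5_2_5", "psethash"]

print(parse_pairs(good))
try:
    parse_pairs(bad)
except IndexError as exc:
    print("IndexError:", exc)
```

Any non-empty value for PSETHASH restores the even number of words, which is consistent with the export PSETHASH=null workaround.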
I managed to fix this problem by editing CMSSW.sh after running crab -create to add the line:
export PSETHASH=null
Now my jobs terminate with exit code 0 as they should. So I don't need any help, but this seems like a bug with a (rarely used?) piece of crab.
regards, josh
(*)
>>> Modify Job Report:
CMSSW_VERSION = CMSSW_5_2_5
/home/uscms69/gram_scratch_d2438ru2HM/https_3a_2f_2fwms015.cnaf.infn.it_3a9000_2fI24d7yPbARvfDv-1YzWTXw/ProdCommon/FwkJobRep/ModifyJobReport.py fjr /home/uscms69/gram_scratch_d2438ru2HM/https_3a_2f_2fwms015.cnaf.infn.it_3a9000_2fI24d7yPbARvfDv-1YzWTXw/crab_fjr_1.xml json /home/uscms69/gram_scratch_d2438ru2HM/https_3a_2f_2fwms015.cnaf.infn.it_3a9000_2fI24d7yPbARvfDv-1YzWTXw/resultCopyFile n_job 1_1_pAR PrimaryDataset null ApplicationFamily MCDataTier ApplicationName sh cmssw_version CMSSW_5_2_5 psethash
Traceback (most recent call last):
  File "/home/uscms69/gram_scratch_d2438ru2HM/https_3a_2f_2fwms015.cnaf.infn.it_3a9000_2fI24d7yPbARvfDv-1YzWTXw/ProdCommon/FwkJobRep/ModifyJobReport.py", line 143, in <module>
    diz[L[i]] = L[i+1]
IndexError: list index out of range
ModifyReportResult=70500
Visit this CMS message (to reply or unsubscribe) at: https://hypernews.cern.ch/HyperNews/CMS/get/crabFeedback/5817.html