dmwm / CRAB2

illegal dataset name created #1049

Closed belforte closed 10 years ago

belforte commented 10 years ago

See the following email from Stefano Belforte, sent to hn-cms-crabFee. under the discussion title "CRAB Feedback":

Thanks Lucia. Indeed, the log from when you managed to publish also shows the same error:

2014-03-06 17:17:56,593 [INFO] Error when listing files in DBSHTTP Error 400: Invalid Input Data /higgsToZA...: Not Match Required FormatTraceback (most recent call last):

And I see a very odd dataset name: /higgsToZAToTauTau_mH_350_mA_30_i2HDM_V0_GEN_SIM/lperrini-higgsToZAtolltautau_350_30_V0_DIGI-2c5f164c0ecfd520d5d2801602656805-Digi.root/USER

the "-Digi.root" before /USER should not be there.

I can't say yet if this is a wrong configuration on your side or a bug in the crab pre-release.

Please send me the crab.cfg you are using for the failing crab -create.

stefano

On 03/07/2014 10:55 AM, Lucia Perrini wrote:

Ciao Stefano, one of the logs is here: http://analysisops.cern.ch/cmserrorreports/download/2582. Unfortunately, when trying to create the 3rd round of jobs, even using crab_new.sh and with dbs_url set to phys03, I can't create anything this time. The error is [*]. Sorry again for making this thread longer and longer, but I really don't know how to solve this problem. Thanks. Lucia

[*] Traceback (most recent call last):
  File "/afs/cern.ch/cms/ccs/wm/scripts/Crab/CRAB_2_10_4_pre3/python/crab.py", line 927, in <module>
    crab.initialize_(options)
  File "/afs/cern.ch/cms/ccs/wm/scripts/Crab/CRAB_2_10_4_pre3/python/crab.py", line 181, in initialize_
    self.initializeActions_(opts)
  File "/afs/cern.ch/cms/ccs/wm/scripts/Crab/CRAB_2_10_4_pre3/python/crab.py", line 521, in initializeActions_
    ncjobs)
  File "/afs/cern.ch/cms/ccs/wm/scripts/Crab/CRAB_2_10_4_pre3/python/Creator.py", line 26, in __init__
    self.createJobTypeObject(ncjobs,skip_blocks,self.isNew)
  File "/afs/cern.ch/cms/ccs/wm/scripts/Crab/CRAB_2_10_4_pre3/python/Creator.py", line 75, in createJobTypeObject
    self.job_type = klass(self.cfg_params,ncjobs,skip_blocks,isNew)
  File "/afs/cern.ch/cms/ccs/wm/scripts/Crab/CRAB_2_10_4_pre3/python/cms_cmssw.py", line 261, in __init__
    blockSites = self.DataDiscoveryAndLocation(cfg_params)
  File "/afs/cern.ch/cms/ccs/wm/scripts/Crab/CRAB_2_10_4_pre3/python/cms_cmssw.py", line 416, in DataDiscoveryAndLocation
    self.pubdata.fetchDBSInfo()
  File "/afs/cern.ch/cms/ccs/wm/scripts/Crab/CRAB_2_10_4_pre3/python/DataDiscovery.py", line 210, in fetchDBSInfo
    files3 = self.queryDbs3(api3,path=self.datasetPath,runselection=runselection,useParent=useparent)
  File "/afs/cern.ch/cms/ccs/wm/scripts/Crab/CRAB_2_10_4_pre3/python/DataDiscovery.py", line 353, in queryDbs3
    result=api.listBlocks(dataset=self.datasetPath)
  File "/afs/cern.ch/cms/ccs/wm/scripts/Crab/CRAB_2_10_4_pre3/external/dbs3client/dbs/apis/dbsClient.py", line 522, in listBlocks
    return self.callServer("blocks", params=kwargs)
  File "/afs/cern.ch/cms/ccs/wm/scripts/Crab/CRAB_2_10_4_pre3/external/dbs3client/dbs/apis/dbsClient.py", line 166, in callServer
    self.__parseForException(http_error)
  File "/afs/cern.ch/cms/ccs/wm/scripts/Crab/CRAB_2_10_4_pre3/external/dbs3client/dbs/apis/dbsClient.py", line 194, in __parseForException
    raise HTTPError(http_error.url, data['exception'], data['message'], http_error.header, http_error.body)
RestClient.ErrorHandling.RestClientExceptions.HTTPError: HTTP Error 400: Invalid Input Data /higgsToZA...: Not Match Required Format

belforte commented 10 years ago

origin of problem understood: user has a filter in the cmssw config file, note the line: filterName = cms.untracked.string('Digi.root')

process.RAWSIMoutput = cms.OutputModule("PoolOutputModule",
    splitLevel = cms.untracked.int32(0),
    eventAutoFlushCompressedSize = cms.untracked.int32(5242880),
    outputCommands = process.RAWSIMEventContent.outputCommands,
    fileName = cms.untracked.string('REDIGI_DIGI_L1_DIGI2RAW_HLT_PU.root'),
    dataset = cms.untracked.PSet(
        filterName = cms.untracked.string('Digi.root'),
        dataTier = cms.untracked.string('GEN-SIM-RAW')
    )
)

This triggers a "never used before" piece of code in ModifyJobReport, which inserts the filter name after the numerical hash in the processed dataset name: https://github.com/dmwm/ProdCommon/blob/master/src/python/ProdCommon/FwkJobRep/ModifyJobReport.py#L261

The resulting name fails the Lexicon check and, in any case, creates an odd situation where pieces of the LFN no longer match the block/dataset pieces as they usually do.
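To make the format failure concrete, here is a minimal sketch of a dataset path check. The pattern is an assumption made for illustration only (three slash-separated parts, with no `.` allowed in the first two); it is not the actual Lexicon rule, which is not reproduced in this thread. The function name `is_valid_dataset_path` is hypothetical as well.

```python
import re

# Hypothetical pattern, NOT the real Lexicon rule: /primary/processed/TIER,
# where primary and processed allow only letters, digits, '_' and '-'.
DATASET_PATH = re.compile(r"^/[A-Za-z0-9_-]+/[A-Za-z0-9_-]+/[A-Z-]+$")


def is_valid_dataset_path(path):
    """Return True if the path matches the assumed three-part format."""
    return DATASET_PATH.match(path) is not None


# Dataset name reported in this issue, with the trailing filter name.
bad = ("/higgsToZAToTauTau_mH_350_mA_15_i2HDM_V0_GEN_SIM/"
       "lperrini-higgsToZAtolltautau_350_15_V0_DIGI-"
       "2c5f164c0ecfd520d5d2801602656805-Digi.root/USER")

# Same name without the "-Digi.root" suffix.
good = bad.replace("-Digi.root", "")

print(is_valid_dataset_path(bad))
print(is_valid_dataset_path(good))
```

Under this assumed rule, the `.` in `Digi.root` is what makes the first name fail, while the name without the suffix passes.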

user managed to insert in DBS3 files like '/store/user/lperrini/05_03_14_higgsToZAllXX_DIGI/higgsToZAtolltautau_350_15/lperrini/higgsToZAToTauTau_mH_350_mA_15_i2HDM_V0_GEN_SIM/higgsToZAtolltautau_350_15_V0_DIGI/2c5f164c0ecfd520d5d2801602656805/REDIGI_DIGI_L1_DIGI2RAW_HLT_PU_1_1_lSv.root'

while the dataset is: '/higgsToZAToTauTau_mH_350_mA_15_i2HDM_V0_GEN_SIM/lperrini-higgsToZAtolltautau_350_15_V0_DIGI-2c5f164c0ecfd520d5d2801602656805-Digi.root/USER'

with the extra -Digi.root

This dataset name cannot be used as input to the DBS3 APIs.
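The mismatch between the LFN and the dataset name can be checked mechanically. The sketch below assumes that the last four directories of the LFN are user / primary dataset / publish name / pset hash, as in the example file above; that layout is inferred from this one example and the helper names are hypothetical.

```python
def expected_processed_name(lfn):
    """Build user-PublishName-psethash from the last four LFN directories.

    Assumes the LFN ends with .../user/primary/publish_name/pset_hash/file.
    """
    parts = lfn.split("/")
    user, _primary, publish_name, pset_hash = parts[-5:-1]
    return "-".join([user, publish_name, pset_hash])


def lfn_matches_dataset(lfn, dataset):
    """Check that the primary and processed parts agree with the LFN."""
    _, primary, processed, _tier = dataset.split("/")
    lfn_primary = lfn.split("/")[-4]
    return primary == lfn_primary and processed == expected_processed_name(lfn)


lfn = ("/store/user/lperrini/05_03_14_higgsToZAllXX_DIGI/"
       "higgsToZAtolltautau_350_15/lperrini/"
       "higgsToZAToTauTau_mH_350_mA_15_i2HDM_V0_GEN_SIM/"
       "higgsToZAtolltautau_350_15_V0_DIGI/"
       "2c5f164c0ecfd520d5d2801602656805/"
       "REDIGI_DIGI_L1_DIGI2RAW_HLT_PU_1_1_lSv.root")

published = ("/higgsToZAToTauTau_mH_350_mA_15_i2HDM_V0_GEN_SIM/"
             "lperrini-higgsToZAtolltautau_350_15_V0_DIGI-"
             "2c5f164c0ecfd520d5d2801602656805-Digi.root/USER")

print(lfn_matches_dataset(lfn, published))
```

With the published name the check fails, because the processed part carries the extra `-Digi.root` that has no counterpart in the LFN.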

I think the best way to fix is to get rid of that part of ModifyJobReport and stick to ProcessedDatasetName = user-PublishName-psethash

since in any case we will not support multiple output datasets in Crab2
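As an illustration of the proposed fix, the sketch below contrasts the two naming schemes. It is not the code in ModifyJobReport; the function `processed_dataset_name` and its `filter_name` parameter are hypothetical and only mimic the behavior described above.

```python
def processed_dataset_name(user, publish_name, pset_hash, filter_name=None):
    """Join the name pieces with '-'.

    With filter_name=None this gives the proposed
    user-PublishName-psethash form; passing a filter name mimics
    the behavior that produced the illegal dataset name.
    """
    parts = [user, publish_name, pset_hash]
    if filter_name:
        parts.append(filter_name)
    return "-".join(parts)


proposed = processed_dataset_name(
    "lperrini",
    "higgsToZAtolltautau_350_15_V0_DIGI",
    "2c5f164c0ecfd520d5d2801602656805",
)

with_filter = processed_dataset_name(
    "lperrini",
    "higgsToZAtolltautau_350_15_V0_DIGI",
    "2c5f164c0ecfd520d5d2801602656805",
    filter_name="Digi.root",
)

print(proposed)
print(with_filter)
```

Dropping the filter name keeps the processed dataset name aligned with the LFN directory pieces, whatever the user puts in `filterName`.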

As a side note, I have asked DBS3 to prevent the publication of this kind of block, so that damage is limited the next time something like this happens: https://github.com/dmwm/DBS/issues/368

belforte commented 10 years ago

So this is an issue for ProdCommon: https://github.com/dmwm/ProdCommon/issues/6. It will need a new ProdCommon tag.

belforte commented 10 years ago

new ProdCommon tag is PRODCOMMON_0_12_18_CRAB_61