dmwm / das2go

Go implementation of Data Aggregation System (DAS) for CMS experiment
MIT License

DAS problem #23

Closed mjeitler closed 5 years ago

mjeitler commented 5 years ago

I am looking for CMS data using this query dataset=/RelValTTbar_13//GEN-SIM site=T1_CH_CERN and I am told there are no data available, which I cannot believe. When I just query for dataset=/RelValTTbar_13//GEN-SIM I get lots of records but all the files I have tried to find using eos ls -alF /eos/cms/store/..... do not exist. How can I find data with really existing files? Even adding status=VALID* does not help.

Thanks for your help! Manfred Jeitler

vkuznet commented 5 years ago

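Manfred, there is nothing wrong with DAS per se; the issue is with your dataset/file discovery. Why are you assuming that your data resides at T1_CH_CERN? The discovery process should be the following:

dataset=/RelValTTbar_13//GEN-SIM

site dataset=/RelValTTbar_13_UP18/CMSSW_11_0_0_pre6-110X_upgrade2018_realistic_v3_FastSim-v1/GEN-SIM-DIGI-RECO

dataset dataset=/RelValTTbar_13//GEN-SIM site=T2_CH_CERN

file dataset=/RelValTTbar_13_UP18/CMSSW_11_0_0_pre6-110X_upgrade2018_realistic_v3_FastSim-v1/GEN-SIM-DIGI-RECO site=T2_CH_CERN

etc.

I don't know what you're doing and can't guide you further, but this is a pretty simple and logical recipe. As for downloading files, I doubt that you can locate them on EOS; instead you should request a transfer to your site via PhEDEx or use the xrdcp command.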

Closing the ticket since it is not an issue with DAS.
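To illustrate the xrdcp route mentioned in this reply, here is a minimal Python sketch that turns a logical file name (the `/store/...` path that DAS returns) into an XRootD URL and a matching `xrdcp` command line. The redirector host `cms-xrd-global.cern.ch` and the helper names `xrootd_url` and `xrdcp_command` are assumptions made for this example, not something stated in the thread; the code only builds strings and does not contact any server.

```python
import shlex

# Assumption: the global CMS XRootD redirector; substitute the one your site recommends.
REDIRECTOR = "cms-xrd-global.cern.ch"


def xrootd_url(lfn, redirector=REDIRECTOR):
    """Build a root:// URL from a logical file name such as /store/.../file.root."""
    if not lfn.startswith("/store/"):
        raise ValueError("expected a logical file name starting with /store/")
    # The LFN keeps its leading slash, which yields the usual double slash after the host.
    return f"root://{redirector}/{lfn}"


def xrdcp_command(lfn, dest=".", redirector=REDIRECTOR):
    """Return the shell command that would copy the file to dest with xrdcp."""
    return f"xrdcp {shlex.quote(xrootd_url(lfn, redirector))} {shlex.quote(dest)}"
```

The same URL form can, in principle, be placed in a CMSSW `fileNames` list instead of the bare `/store/...` path when the file is to be read remotely rather than copied; whether that works depends on the file being available through the redirector.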

mjeitler commented 5 years ago

Dear Valentin,

Thank you for the reply! I am not assuming anything, I just need some RAW data to produce test vectors for the Global Trigger of CMS.

I would like to access them from CMSSW (preferably without copying if possible - but if it’s easier to copy them I can do that, too) like this:

====
process.source = cms.Source("PoolSource",
    secondaryFileNames = cms.untracked.vstring(),
    fileNames = cms.untracked.vstring(
        '/store/relval/CMSSW_10_1_0/RelValTTbar_13/GEN-SIM-DIGI-RAW/101X_upgrade2018_design_v7_resub-v1/10000/5C7E71FC-E037-E811-B0B4-0CC47A78A468.root',

I have tried to proceed as you suggest: Typing

site dataset=/RelValTTbar_13//RAW*

yields, among others,


Site name: T2_CH_CERN (https://cmsweb.cern.ch/das/request?input=site%3DT2_CH_CERN&instance=prod/global)
StorageElement: srm-eoscms.cern.ch (http://srm-eoscms.cern.ch)
Datasets: https://cmsweb.cern.ch/das/request?instance=prod/global&input=dataset+site%3DT2_CH_CERN%2A
Sources: phedex

but when I then try

dataset=/RelValTTbar_13//RAW* site=T2_CH_CERN

I get

No results found
DAS unable to find any results for your query. Please revisit your query by reviewing DAS query guide (https://cmsweb.cern.ch/das/faq) or submit a DAS github issue (https://github.com/dmwm/das2go/issues/new) to resolve your query request.

So I am a bit stuck. Sorry but I am new to DAS and would just like to find ANY files satisfying my requirements ( dataset=/RelValTTbar_13//RAW* ), preferably sitting at CERN. If there are really none at CERN, can I access them by CMSSW as shown above or do I have to copy them?

Thanks a lot for your help!

Manfred


Manfred Jeitler
CERN EP, mailbox E02400, CH-1211 Geneva 23, Switzerland
office +41 22 767 6307 (office location: building 21, R-029)
mobile +41 75 411 0862 (16-0862 from inside CERN)
home +33 4 50 40 66 56
fax +41 22 766 7967

On 4 Sep 2019, at 14:14, Valentin Kuznetsov <notifications@github.com> wrote:


dataset=/RelValTTbar_13//GEN-SIM

site dataset=/RelValTTbar_13_UP18/CMSSW_11_0_0_pre6-110X_upgrade2018_realistic_v3_FastSim-v1/GEN-SIM-DIGI-RECO

dataset dataset=/RelValTTbar_13//GEN-SIM site=T2_CH_CERN

file dataset=/RelValTTbar_13_UP18/CMSSW_11_0_0_pre6-110X_upgrade2018_realistic_v3_FastSim-v1/GEN-SIM-DIGI-RECO site=T2_CH_CERN

etc.


vkuznet commented 5 years ago

Manfred, DAS has nothing to do with how you'll use your data. DAS is a tool to discover data and nothing else. With that in mind, what you need to know is a little of the DAS query language. The site query you placed with a dataset pattern is not supported, since a pattern can match very many datasets and the query would become slow. I provided a recipe for you and you should follow it: look up your datasets from the dataset pattern, then for each dataset you are interested in find a site (the site query requires the full dataset path), and then find the files you need.
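The three-step order described above can be sketched in Python as follows. This is only an illustration: the helper names are hypothetical, and `run_das` assumes the `dasgoclient` command-line tool is installed and on PATH, which the thread does not state. The query strings themselves follow the forms shown earlier in the thread, and the site and file helpers refuse wildcard patterns because the site lookup requires a full dataset path.

```python
import shutil
import subprocess


def dataset_query(pattern):
    """Step 1: look up datasets from a pattern (wildcards are allowed here)."""
    return f"dataset={pattern}"


def site_query(dataset):
    """Step 2: find the sites hosting one dataset; needs the full dataset path."""
    if "*" in dataset:
        raise ValueError("site lookup needs a full dataset path, not a pattern")
    return f"site dataset={dataset}"


def file_query(dataset, site):
    """Step 3: list the files of that dataset at the chosen site."""
    if "*" in dataset:
        raise ValueError("file lookup needs a full dataset path, not a pattern")
    return f"file dataset={dataset} site={site}"


def run_das(query):
    """Run one query through dasgoclient (assumed to be installed); return output lines."""
    if shutil.which("dasgoclient") is None:
        raise RuntimeError("dasgoclient not found on PATH")
    result = subprocess.run(
        ["dasgoclient", f"-query={query}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

A typical use would be to call `run_das(dataset_query(...))` once, pick one full dataset path from the result, and pass that path to `site_query` and then `file_query`.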