dmwm / dasgoclient

Data Aggregation System (DAS) Go client
https://cmsweb.cern.ch/das/
MIT License

Bug in wildcard search #12

Closed kpedro88 closed 7 years ago

kpedro88 commented 7 years ago

I have noticed some bad behavior for wildcards in certain positions. After testing a bunch of cases, I think the problem occurs whenever one of the three fields in a dataset name /A/B/C is just a wildcard. Such commands sit for 30-60 seconds and then start printing every dataset name known to DAS (many thousands). Examples:

dasgoclient -query="dataset=/*/Run2017A-PromptReco-v1/MINIAOD"
dasgoclient -query="dataset=/MET/*/MINIAOD"
dasgoclient -query="dataset=/MET/Run2017A-PromptReco-v1/*"

In contrast, a command like this works fine:

> dasgoclient -query="dataset=/MET/Run2017*/MINIAOD"
/MET/Run2017A-PromptReco-v1/MINIAOD
/MET/Run2017A-PromptReco-v2/MINIAOD

The failing commands work fine on both the Python client and the web interface:

> das_client --query="dataset=/*/Run2017A-PromptReco-v1/MINIAOD"

Showing 1-10 out of 49 results, for more results use --idx/--limit options

/Charmonium/Run2017A-PromptReco-v1/MINIAOD
/Commissioning/Run2017A-PromptReco-v1/MINIAOD
/Commissioning1/Run2017A-PromptReco-v1/MINIAOD
/Commissioning2/Run2017A-PromptReco-v1/MINIAOD
/Commissioning3/Run2017A-PromptReco-v1/MINIAOD
/Commissioning4/Run2017A-PromptReco-v1/MINIAOD
/CommissioningDoubleJet/Run2017A-PromptReco-v1/MINIAOD
/CommissioningEGamma/Run2017A-PromptReco-v1/MINIAOD
/CommissioningMuons/Run2017A-PromptReco-v1/MINIAOD
/CommissioningSingleJet/Run2017A-PromptReco-v1/MINIAOD

https://cmsweb.cern.ch/das/request?view=list&limit=50&instance=prod%2Fglobal&input=dataset%3D%2F*%2FRun2017A-PromptReco-v1%2FMINIAOD

I'm using whatever version of the client is provided at /cvmfs/cms.cern.ch/common/dasgoclient. (Maybe there should be a -version flag added?)

vkuznet commented 7 years ago

Kevin, this is a known issue caused by a bug in the DBS server API. It should be fixed by next Tuesday with the cmsweb upgrade. The problem is that the DAS web interface and the Python client decompose a pattern query into sub-parts (primary dataset, processed dataset, and data tier) and then send the request to DBS. The Go client uses a different DBS API, which is where the bug was introduced. So please wait until next week.

Also, I have already submitted a version flag to the Go client, so it is in the CMS build and will be available soon.

Best, Valentin
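
For readers following along, the decomposition mentioned in the comment above (splitting a dataset pattern into primary dataset, processed dataset, and data tier) could look roughly like the sketch below. This is only an illustration, not the actual DAS, DBS, or dasgoclient code: the type `DatasetPattern` and the functions `splitDatasetPattern` and `bareWildcards` are hypothetical names, and the sketch assumes a pattern always has the form /A/B/C.

```go
package main

import (
	"fmt"
	"strings"
)

// DatasetPattern holds the three fields of a /primary/processed/tier pattern.
// Hypothetical type, used only for this illustration.
type DatasetPattern struct {
	Primary   string
	Processed string
	Tier      string
}

// splitDatasetPattern splits a pattern such as "/MET/*/MINIAOD" into its
// three fields. It assumes the pattern starts with "/" and has exactly
// three slash-separated fields.
func splitDatasetPattern(pattern string) (DatasetPattern, error) {
	if !strings.HasPrefix(pattern, "/") {
		return DatasetPattern{}, fmt.Errorf("pattern %q must start with '/'", pattern)
	}
	fields := strings.Split(strings.TrimPrefix(pattern, "/"), "/")
	if len(fields) != 3 {
		return DatasetPattern{}, fmt.Errorf("pattern %q must have three fields, got %d", pattern, len(fields))
	}
	return DatasetPattern{Primary: fields[0], Processed: fields[1], Tier: fields[2]}, nil
}

// bareWildcards returns the names of the fields that consist of a lone "*",
// which is the case reported as failing in this issue.
func (p DatasetPattern) bareWildcards() []string {
	var names []string
	if p.Primary == "*" {
		names = append(names, "primary")
	}
	if p.Processed == "*" {
		names = append(names, "processed")
	}
	if p.Tier == "*" {
		names = append(names, "tier")
	}
	return names
}

func main() {
	queries := []string{
		"/*/Run2017A-PromptReco-v1/MINIAOD",
		"/MET/*/MINIAOD",
		"/MET/Run2017*/MINIAOD",
	}
	for _, q := range queries {
		p, err := splitDatasetPattern(q)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> primary=%q processed=%q tier=%q bare wildcards=%v\n",
			q, p.Primary, p.Processed, p.Tier, p.bareWildcards())
	}
}
```

With the fields separated like this, a client could pass each one to the backend as its own parameter instead of sending the whole pattern as a single string; whether DBS accepts them in that form is an assumption of this sketch.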


kpedro88 commented 7 years ago

Great, thanks for the quick reply.

vkuznet commented 7 years ago

Kevin, could you please try your examples again now? The cmsweb upgrade was done today and this issue should be gone. I just ran the query dataset=/*/Run2017A-PromptReco-v1/MINIAOD and it returned only MINIAOD datasets. I just want to double-check before closing the issue.

kpedro88 commented 7 years ago

Yes, it works now. Thanks!