Closed matz-e closed 6 years ago
Matthias, with a switch from das_client (python) to dasgoclient (Go-based) I targeted that tool should provide maximum speed of the queries. As such the detailed output of queries is disabled by default. What you need is to enable it. This can be done as following:
file dataset=/JetHT/Run2017C-v1/RAW detail=True | grep file.nevents
Basically you add to a query detail=True which trigger to fetch detailed output from DBS (it more costly and therefore takes longer to fetch and process).
With detail=True option I found a compromise between various user use cases. For those who most of the time only care about speed of queries it is disabled by default, for other it is there if you'll adjust your query.
Please try it out and either provide feedback or close the ticket.
Best, Valentin.
On 0, Matthias Wolf notifications@github.com wrote:
Filtering seems to be broken, and requesting JSON output after filtering gives invalid JSON:
(master U:1 ?:5 ✗) » dasgoclient -query "file dataset=/JetHT/Run2017C-v1/RAW | grep file.nevents"|head [{"name":"/store/data/Run2017C/JetHT/RAW/v1/000/300/064/00000/60E64405-E073-E711-BEC4-02163E01A4B1.root"}] [{"name":"/store/data/Run2017C/JetHT/RAW/v1/000/300/064/00000/2AF23E03-E073-E711-B613-02163E014106.root"}] [{"name":"/store/data/Run2017C/JetHT/RAW/v1/000/300/064/00000/6ADB581F-E073-E711-898B-02163E0134D6.root"}] [{"name":"/store/data/Run2017C/JetHT/RAW/v1/000/300/064/00000/E0457FCC-E073-E711-ACA2-02163E01A2DA.root"}] [{"name":"/store/data/Run2017C/JetHT/RAW/v1/000/300/064/00000/2CCEFBD0-E073-E711-8020-02163E011884.root"}] [{"name":"/store/data/Run2017C/JetHT/RAW/v1/000/300/064/00000/9653BC7A-E173-E711-8929-02163E0118E2.root"}] [{"name":"/store/data/Run2017C/JetHT/RAW/v1/000/300/079/00000/62D99022-F073-E711-A3CA-02163E019C12.root"}] [{"name":"/store/data/Run2017C/JetHT/RAW/v1/000/300/079/00000/D2D6A5D9-F273-E711-89F7-02163E01A208.root"}] [{"name":"/store/data/Run2017C/JetHT/RAW/v1/000/300/079/00000/92276AFA-F273-E711-9291-02163E013504.root"}] [{"name":"/store/data/Run2017C/JetHT/RAW/v1/000/300/079/00000/244FFCFE-F273-E711-8057-02163E019BC9.root"}] (master U:1 ?:5 ✗) » dasgoclient -query "file dataset=/JetHT/Run2017C-v1/RAW | grep file.name" -json|head [ /store/data/Run2017C/JetHT/RAW/v1/000/300/064/00000/60E64405-E073-E711-BEC4-02163E01A4B1.root , /store/data/Run2017C/JetHT/RAW/v1/000/300/064/00000/2AF23E03-E073-E711-B613-02163E014106.root , /store/data/Run2017C/JetHT/RAW/v1/000/300/064/00000/6ADB581F-E073-E711-898B-02163E0134D6.root , /store/data/Run2017C/JetHT/RAW/v1/000/300/064/00000/E0457FCC-E073-E711-ACA2-02163E01A2DA.root , /store/data/Run2017C/JetHT/RAW/v1/000/300/064/00000/2CCEFBD0-E073-E711-8020-02163E011884.root , /store/data/Run2017C/JetHT/RAW/v1/000/300/064/00000/9653BC7A-E173-E711-8929-02163E0118E2.root , /store/data/Run2017C/JetHT/RAW/v1/000/300/079/00000/62D99022-F073-E711-A3CA-02163E019C12.root , /store/data/Run2017C/JetHT/RAW/v1/000/300/079/00000/D2D6A5D9-F273-E711-89F7-02163E01A208.root , /store/data/Run2017C/JetHT/RAW/v1/000/300/079/00000/92276AFA-F273-E711-9291-02163E013504.root ,
-- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/dmwm/dasgoclient/issues/13
Valentin,
Thanks for the feedback! I've tried the option out (I didn't notice it before, sorry), and it works for non-json output. I still think the following changes could be added:
detail=True
if a filter is involved. This would make sense IMHO, since filtering like | grep file.nevents
implies that more than basic information is needed, and would seem more natural?-json
switch.Thanks, Matthias
Matthias, thanks for good suggestion. I implemented your first bullet, the auto-detection of detailed output when user provide a filter, see https://github.com/dmwm/das2go/commit/114fc26950cc144131be41ae02d3b55c7baf886e
But your second suggestion and usage of filter in json output is much more cumbersome. The problem is that when user ask for json the output should be parse-able JSON output. Adding a string error/warning to it will break this. The filter itself does not break the query either, therefore throwing an error message can be confusing at least. I'm in favor of returning full detailed output and ignore filter in json output. I need to think about it. Let me know your thoughts on it if you want.
I'll think about your second item for a bit and then either schedule new version without it (but with a provided fix for first bulet) or add more code to handle error or warning.
Best, Valentin.
On 0, Matthias Wolf notifications@github.com wrote:
Valentin,
Thanks for the feedback! I've tried the option out (I didn't notice it before, sorry), and it works for non-json output. I still think the following changes could be added:
- Add
detail=True
if a filter is involved. This would make sense IMHO, since filtering like| grep file.nevents
implies that more than basic information is needed, and would seem more natural?- Print out some error message if invalid combinations of options are used, in this case if a filter is used together with the
-json
switch.Thanks, Matthias
-- You are receiving this because you commented. Reply to this email directly or view it on GitHub: https://github.com/dmwm/dasgoclient/issues/13#issuecomment-330523888
Matthias, requested changes to add detail=True in queries with filters is committed to main repository. See build: https://github.com/cms-sw/cmsdist/pull/3453 Once it's done changes will appear in all CMSSW releases. Best, Valentin.
On 0, Matthias Wolf notifications@github.com wrote:
Valentin,
Thanks for the feedback! I've tried the option out (I didn't notice it before, sorry), and it works for non-json output. I still think the following changes could be added:
- Add
detail=True
if a filter is involved. This would make sense IMHO, since filtering like| grep file.nevents
implies that more than basic information is needed, and would seem more natural?- Print out some error message if invalid combinations of options are used, in this case if a filter is used together with the
-json
switch.Thanks, Matthias
-- You are receiving this because you commented. Reply to this email directly or view it on GitHub: https://github.com/dmwm/dasgoclient/issues/13#issuecomment-330523888
Great, thank you! For the second point, I forgot to stress that using a filter and requesting JSON output format does not give valid JSON:
(master {1} ?:2 ✗) » ./dasgoclient -query "file dataset=/JetHT/Run2017C-v1/RAW | grep file.name" -json|head
[
/store/data/Run2017C/JetHT/RAW/v1/000/300/064/00000/60E64405-E073-E711-BEC4-02163E01A4B1.root ,
The filename should actually be surrounded by quotes. I would still suggest having dasgoclient
fail when a filter is used with -json
.
Thanks, Matthias
Matthias, which version of dasgoclient are you using in this example. I just did on lxplus:
dasgoclient -query="file dataset=/ZMM/Summer11-DESIGN42_V11_428_SLHC1-v1/GEN-SIM" -json [ {"das":{"expire":1506519335,"instance":"prod/global","primary_key":"file.name","record":1,"services":["dbs3:files_via_dataset"]},"file":[{"name":"/store/mc/Summer11/ZMM/GEN-SIM/DESIGN42_V11_428_SLHC1-v1/0003/02ACAA1A-9F32-E111-BB31-0002C90B743A.root"}],"qhash":"f28499f53016567764d7ba76a3d76e4a"} , {"das":{"expire":1506519335,"instance":"prod/global","primary_key":"file.name","record":1,"services":["dbs3:files_via_dataset"]},"file":[{"name":"/store/mc/Summer11/ZMM/GEN-SIM/DESIGN42_V11_428_SLHC1-v1/0003/24DAE39C-AC32-E111-AED8-0002C90B743A.root"}],"qhash":"f28499f53016567764d7ba76a3d76e4a"} , ...
and it produces proper json, i.e. it yields full DAS records preserving data-types.
Best, Valentin.
On 0, Matthias Wolf notifications@github.com wrote:
Great, thank you! For the second point, I forgot to stress that using a filter and requesting JSON output format does not give valid JSON:
(master {1} ?:2 ✗) » ./dasgoclient -query "file dataset=/JetHT/Run2017C-v1/RAW | grep file.name" -json|head [ /store/data/Run2017C/JetHT/RAW/v1/000/300/064/00000/60E64405-E073-E711-BEC4-02163E01A4B1.root ,
The filename should actually be surrounded by quotes. I would still suggest having
dasgoclient
fail when a filter is used with-json
.Thanks, Matthias
-- You are receiving this because you commented. Reply to this email directly or view it on GitHub: https://github.com/dmwm/dasgoclient/issues/13#issuecomment-332488113
It should be the newest from master
, I compiled it locally. Note that your query does not have a filter. I don't consider this very important, so I can also close this issue if you deem this not worth fixing.
Hmm, strange, since I do use my master and I don't see this issue (mailformed json output). Please note that you need to have das2go package too which I periodically update. Could you please try to check/update/download latest versions of das2go and dasgoclient and compile later again and try it out.
I do care about issues and broken json is certainly worth of fixing. But as far as I can tell it is not the case in my setup. So, it's worth to double check at your end.
Best, Valentin.
On 0, Matthias Wolf notifications@github.com wrote:
It should be the newest from
master
, I compiled it locally. Note that your query does not have a filter. I don't consider this very important, so I can also close this issue if you deem this not worth fixing.-- You are receiving this because you commented. Reply to this email directly or view it on GitHub: https://github.com/dmwm/dasgoclient/issues/13#issuecomment-332580153
I fixed these issues and new dasgoclient is available in CMSSW releases.
Filtering seems to be broken, and requesting JSON output after filtering gives invalid JSON: