Closed by belforte 1 year ago
Stefano, this query is complicated, as it runs through a series of DBS APIs. As far as I can tell from the code, it does the following:
file,run,lumi dataset=/a/b/c
resolves into finding blocks, and then for every block we look up file,run,lumi triplets. That said, it seems that by default it queries blocks with all files. To select only files with valid status, you should change the query to
file,run,lumi dataset=/a/b/c status=valid
where status is the DAS keyword that specifies file status. Remember that a DAS query is a composition of <select keys> <conditions>; therefore you select file, run, lumi and apply the conditions dataset=/a/b/c and status.
I'm on a break now and will not spend time on it until I am back. You may try it with status to see the difference; if it produces the same results, we'll need to resolve all API calls to DBS to see how it is done. You can do that too by adding the -verbose=2 argument to dasgoclient, and you'll see all the URL calls it makes.
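To make the <select keys> <conditions> composition concrete, here is a minimal Python sketch that assembles the query string and the matching dasgoclient command line. The helper names build_das_query and dasgoclient_argv are hypothetical, introduced only for illustration; the sketch assumes dasgoclient takes the query via -query=..., and it only prints the command rather than running it.

```python
import shlex


def build_das_query(select_keys, conditions):
    """Compose a DAS query of the form '<select keys> <conditions>'."""
    select = ",".join(select_keys)
    conds = " ".join(f"{key}={value}" for key, value in conditions.items())
    return f"{select} {conds}".strip()


def dasgoclient_argv(query, verbose=0):
    """Build the argument list for dasgoclient (assumes the -query=... form)."""
    argv = ["dasgoclient", f"-query={query}"]
    if verbose:
        # -verbose=2 makes dasgoclient show the URL calls it performs
        argv.append(f"-verbose={verbose}")
    return argv


# Select file, run, lumi; apply the dataset and status conditions.
query = build_das_query(
    ["file", "run", "lumi"],
    {"dataset": "/a/b/c", "status": "valid"},
)
print(query)
# Shell-quoted command line, ready to paste into a terminal.
print(shlex.join(dasgoclient_argv(query, verbose=2)))
```

Dropping the status entry from the conditions gives back the original query, which makes it easy to compare the two outputs side by side.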
THANKS VERY MUCH. I will try. Sorry to have bugged at the wrong time.
file,run,lumi dataset=/a/b/c status=valid
works like a charm!!!! I tested on a dataset with invalid files (of course).
Thank you Valentin :bowing_man:
I found a more fundamental problem with
file,run,lumi dataset=/a/b/c
the output has one entry per file (OK), but then for each file there is a list of run numbers and one uncorrelated list of lumis, while of course one needs the list of proper (run,lumi) pairs in whatever format.
There is no problem with run,lumi dataset=/a/b/c, since it produces one entry per run with one list of lumis in each, but I can't filter on file status in that.
I guess I am sticking with listing (valid) files first, and lumis in each second.
The use case for this is marginal (a little-used CRABClient functionality), so I think there is no point in "fixing".
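The two-step approach (valid files first, then lumis for each file) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the query file dataset=... status=valid is assumed to accept the status keyword the same way file,run,lumi does, the per-file query run,lumi file=... is an assumed form exposed as a parameter, and the dasgoclient output is kept as raw text lines rather than parsed.

```python
import subprocess


def run_dasgoclient(query):
    """Run dasgoclient for one query and return its non-empty output lines."""
    result = subprocess.run(
        ["dasgoclient", f"-query={query}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def lumis_per_valid_file(dataset, run_query=run_dasgoclient,
                         per_file_query="run,lumi file={lfn}"):
    """Map each valid file of a dataset to the raw output of a per-file query.

    run_query is injectable so the logic can be tried without dasgoclient;
    per_file_query is an assumed query template, adjust it as needed.
    """
    # Step 1: list only the valid files of the dataset.
    valid_files = run_query(f"file dataset={dataset} status=valid")
    # Step 2: one query per file, so runs and lumis stay tied to that file.
    return {lfn: run_query(per_file_query.format(lfn=lfn)) for lfn in valid_files}
```

Calling lumis_per_valid_file("/a/b/c") issues one dasgoclient call per valid file, which is exactly the "ton of queries" cost mentioned in the thread; passing a different run_query makes it possible to dry-run the logic against canned output.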
I am not sure if it is better to ask here or in cms-talk; in that case please advise, thanks!
when I issue
does it list runs and lumis from all files, or only valid ones (is_file_valid=1)? If the former, is there a dasgoclient syntax which allows restricting things that way, or is the only way to make a list of valid files first and run a ton of dasgoclient queries after?
I tried
with or w/o --json at the right, but the output is the same as if omitting is_file_valid
Also I can't use | grep file.xxx, I presume because the file dictionary in the output only contains the file name, differently from the query
file dataset=...