Closed meliache closed 3 years ago
Sure, I can test it.
Unfortunately, it doesn't work. I think we need a unit test for the get_unique_lfns
function, as it is not so trivial.
I have these real-life examples:
grid_list = ['Upsilon4SBpcandee_00000_job181817516_00.root',
'Upsilon4SBpcandee_00001_job181817517_00.root',
'Upsilon4SBpcandee_00002_job181817518_00.root',
'Upsilon4SBpcandee_00003_job181817519_00.root',
'Upsilon4SBpcandee_00004_job181817520_00.root',
'Upsilon4SBpcandee_00005_job181817521_00.root',
'Upsilon4SBpcandee_00006_job181817522_00.root',
'Upsilon4SBpcandee_00007_job181817523_00.root',
'Upsilon4SBpcandee_00008_job181817524_00.root',
'Upsilon4SBpcandee_00009_job181817525_00.root',
'Upsilon4SBpcandee_00010_job181817526_00.root',
'Upsilon4SBpcandee_00011_job181817527_00.root',
'Upsilon4SBpcandee_00012_job181817528_00.root',
'Upsilon4SBpcandee_00012_job181817528_02.root',
'Upsilon4SBpcandee_00013_job181817529_00.root',
'Upsilon4SBpcandee_00014_job181817530_00.root',
'Upsilon4SBpcandee_00015_job181817531_00.root',
'Upsilon4SBpcandee_00016_job181817532_00.root',
'Upsilon4SBpcandee_00017_job181817533_02.root',
'Upsilon4SBpcandee_00018_job181817534_01.root',
'Upsilon4SBpcandee_00019_job181817535_00.root',
]
Oh, thanks for noticing that. I thought I had tested it on the command line with some example output but I now realized that my output name contains underscores, e.g. B_ntuple.root
. My _get_lfn_upto_reschedule_number()
worked for that, not for your case, as I kind of counted the underscores from the beginning:
"_".join(lfn.split("_")[0:4])
I also do dumb mistakes. Now I changed that to to the following:
"_".join(lfn.split("_")[:-1])
which works in both cases:
In [69]: "_".join('u4s_B_candee_00019_job181817535_00.root'.split("_")[:-1])
Out[69]: 'u4s_B_candee_00019_job181817535'
In [70]: "_".join('Upsilon4SBpcandee_00019_job181817535_00.root'.split("_")[:-1])
Out[70]: 'Upsilon4SBpcandee_00019_job181817535'
Definetly should make a unittest for that, because this is a case that is easily testable. Thanks for notifying me.
Thanks, after this change I can confirm that my downloads are finishing successfully
I added some unit tests for my LFN helper functions introduced here and reverted a commit from another feature branch that accidentally made it into this.
For checking if a grid dataset has been completely downloaded, I compare the output of
gb2_ds_list
(lists the remote dataset) with the list of downloaded files. However, since gbasf2 release 5,gb2_ds_list
also lists duplicate outputs from jobs that had been rescheduled. Thegb2_ds_get
download script filters those out, but with this, I can't simply use the comparison anymore.Thus, via this PR I manually filter out the duplicated remote LFN's and only show those with the highest number. For the logic, I adapted some of the code of gbasf2/BelleDirac, in particular the
findDuplicatedJobID()
function:Sadly, it also just uses string parsing. I was a little worried that it might brake for old datasets produced with old gbasf2 versions, but I try to recognize if the job LFN is the new version and if it's not I don't remove anything.
There is a gbasf2 issue to do this filtering automatically, see BIIDCD-1240, so after that's merged this code might be simplified.
My wish is that it will be possible to just rely on the exit code of
gb2_ds_get
to know if a dataset has been successfully and completely downloaded. This whish has been also formulated into an issue, see BIIDCD-1238.As far as I know @bilokin already had a workaround by disabling the comparison of datasets altogether. Maybe he or some other people could test this. Before doing this fix, I had merged the PR #67 which refactored the download function and made the partial download directories persistent, which made debugging of the download and the implementation of changes much simpler. However, for that PR I hadn't waited for reviews of other people. Maybe it would be good if others could check if their b2luigi works for the head of this branch. Remember, you can just install it for testing, e.g. with