nils-braun / b2luigi

Task scheduling and batch running for basf2 jobs made simple
GNU General Public License v3.0

Fix moving downloaded gbasf2 output files for projects with multiple data-blocks (subs) #148

Closed meliache closed 3 years ago

meliache commented 3 years ago

For multiple gbasf2 output subs, b2luigi moves subsequent sub-directories into the output-dir instead of just their contents

Suppose the grid output of a job contains multiple data-blocks of ntuple files (each with up to 1000 output files), e.g. for an output named B.root:

<gbasf2 project name>/sub00/job_name*B.root
<gbasf2 project name>/sub01/job_name*B.root
<gbasf2 project name>/sub02/job_name*B.root

Then b2luigi will move the contents of sub00 to

<result_dir>/B.root/job_name*B.root

However, for the other subs, because the B.root directory already exists, shutil.move moves the sub directories themselves into it, so their outputs end up in

<result_dir>/B.root/sub01/job_name*B.root
<result_dir>/B.root/sub02/job_name*B.root
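The behaviour can be reproduced locally without the grid. The following is a minimal sketch, not the actual b2luigi code: the function name `reproduce_nested_subs`, the directory names and the `job_<i>_B.root` file names are made up for illustration. It relies only on the documented behaviour of `shutil.move`, which renames the source when the destination does not exist and moves the source inside the destination when it is an existing directory.

```python
import shutil
import tempfile
from pathlib import Path


def reproduce_nested_subs(n_subs=3):
    """Move every sub directory to the same target and list what ends up there."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-ins for <gbasf2 project name> and <result_dir>/B.root
        project = Path(tmp) / "project"
        target = Path(tmp) / "result_dir" / "B.root"
        target.parent.mkdir(parents=True)

        # One empty dummy ntuple per data-block
        for i in range(n_subs):
            sub = project / f"sub{i:02d}"
            sub.mkdir(parents=True)
            (sub / f"job_{i}_B.root").touch()

        # First move renames sub00 to B.root; later moves land inside B.root
        for sub in sorted(project.iterdir()):
            shutil.move(str(sub), str(target))

        return sorted(p.relative_to(target).as_posix() for p in target.rglob("*.root"))


layout = reproduce_nested_subs()
print(layout)
```

In the printed listing, only the file from sub00 sits directly in the target directory, while the files from sub01 and sub02 stay nested in their sub directories.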

Thanks to @eckerpatrick for pointing this out to me.

Testing with multiple data-blocks on the grid is not easy because the downloading can take over half a day. But I still think I should put the moving functionality into its own function and test it with some empty dummy .root files. As this bug demonstrates, sometimes just writing some quick code and thinking there is nothing that could go wrong is not enough 🙈
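As a rough sketch of what such a function and test could look like (an illustration only, not the fix that was eventually merged): the names `move_sub_contents` and `check_with_dummy_files`, the `sub*` glob and the default file pattern are all assumptions. The idea is to move the files inside each sub directory one by one, so it no longer matters whether the target directory already exists.

```python
import shutil
import tempfile
from pathlib import Path


def move_sub_contents(project_dir, target_dir, pattern="*B.root"):
    """Move matching files from every sub* directory directly into target_dir."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for sub in sorted(Path(project_dir).glob("sub*")):
        for src in sorted(sub.glob(pattern)):
            dest = target_dir / src.name
            # Refuse to silently overwrite a file with the same name
            if dest.exists():
                raise FileExistsError(dest)
            shutil.move(str(src), str(dest))
            moved.append(dest.name)
    return moved


def check_with_dummy_files(n_subs=3):
    """Build empty dummy .root files, run the move, and list the target directory."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "project"
        target = Path(tmp) / "result_dir" / "B.root"
        for i in range(n_subs):
            sub = project / f"sub{i:02d}"
            sub.mkdir(parents=True)
            (sub / f"job_{i}_B.root").touch()
        move_sub_contents(project, target)
        return sorted(p.relative_to(target).as_posix() for p in target.rglob("*"))


flat_layout = check_with_dummy_files()
print(flat_layout)
```

With this approach all dummy files end up directly in the target directory, with no sub01 or sub02 level in between. The emptied sub directories are left behind in the project directory and would need separate cleanup.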

Bilokin commented 3 years ago

Hi @meliache, #147 is a duplicate of this ticket. I created that ticket without filling in the Assignee field, and then the GitHub interface confused me when I tried to correct this.

meliache commented 3 years ago

Oh sorry, I somehow totally overlooked #147. I don't think you have the rights to assign people to issues, but pinging me (via @ or the usual chat/email channels) when you don't get an answer is the right thing to do :+1:

Bilokin commented 3 years ago

OK, thanks. I will test the solution today.