nils-braun / b2luigi

Task scheduling and batch running for basf2 jobs made simple
GNU General Public License v3.0

Output directories not created when running tasks in test-mode #171

Open meliache opened 2 years ago

meliache commented 2 years ago

create_output_dirs is called explicitly for the local batch system in SendJobWorker._create_task_process. However, it is not called there for the remote batch systems or the test batch system. I assume that for the remote batch systems it is called elsewhere, for example in run_as_batch_worker. That function also sets the _dispatch_local_execution setting, which causes create_output_dirs to be called again in the dispatch function, which seems a bit redundant to me. My suspicion is that the call is missed entirely when running test tasks locally, so they might fail with an error saying that the output directory does not exist.
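The suspected failure mode can be sketched without b2luigi itself. The snippet below is only an illustration under stated assumptions: write_output and its create_dirs flag are hypothetical stand-ins for "a task writes its output" and "the runner called something like create_output_dirs beforehand". They are not part of the b2luigi API.

```python
import os
import tempfile


def write_output(path, create_dirs):
    """Write a small file, optionally creating the parent directory first.

    `create_dirs` is a hypothetical flag standing in for whether the
    runner created the output directory before the task body ran.
    """
    if create_dirs:
        # exist_ok=True makes this idempotent, so repeated calls are harmless.
        os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write("done\n")


def run_demo():
    """Return (failed_without_dirs, file_exists_after_creating_dirs)."""
    with tempfile.TemporaryDirectory() as base:
        # Nested directory layout chosen for illustration only.
        target = os.path.join(base, "results", "param=1", "out.txt")
        try:
            write_output(target, create_dirs=False)
            failed_without_dirs = False
        except FileNotFoundError:
            # The parent directory was never created, so open() fails.
            failed_without_dirs = True
        write_output(target, create_dirs=True)
        write_output(target, create_dirs=True)  # second call still succeeds
        return failed_without_dirs, os.path.exists(target)
```

If the real method is similarly idempotent (an assumption, not something verified here), the redundant calls would be harmless but unnecessary, whereas a single missing call on one code path would be enough to produce the reported error.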

I also found that it is called explicitly in the definition of MergerTask. That call is either redundant or necessary. If it is necessary, I think it should not be: the method's documentation says "Normally only used internally", so users should not have to call it explicitly in their tasks.

@itsaklid reported the issue to me; at least, he said that for one of his tasks the output directory wasn't created. I'm not entirely sure my hypothesis is right about which type of task and run method this affects, but I remember seeing something very similar a long time ago.

It would be good to check why this error does not occur in our unit tests, to see whether we can create a minimal reproducible example, and to add that example to our unit tests. We should also check whether some of the redundant calls of that method can be removed. @itsaklid, if you have a minimal example, I'd be happy if you could share it. Also, maybe @nils-braun has some background knowledge on why it is called explicitly in MergerTask, why SendJobWorker only calls it for local tasks, etc. None of this is really documented, so I have to guess the behaviour from reading the code, and I'm not sure my interpretation is right.