nils-braun / b2luigi

Task scheduling and batch running for basf2 jobs made simple
GNU General Public License v3.0

Enhancement: Async gbasf2 submission and/or download to avoid delay of scheduling #129

Open meliache opened 3 years ago

meliache commented 3 years ago

The gbasf2 submission and dataset download operations take a long time. Even when remote workers work in parallel, scheduling happens in serial by default. (The exception is when the parallel_scheduling config option is set to true. However, this didn't work for me; if you had success with it, please message me.) The long gbasf2 submission and the dataset download seem to block the scheduling until the operation is done. This is something that I can live with, since usually only a few gbasf2 projects are required, but it would be cool to do something about it.

The gbasf2 dataset download is currently triggered in the get_job_status method as a subroutine call once the gbasf2 project is all done. Maybe we can initiate the download as an async subprocess and only mark the job as really complete when the download is done, at least when the gbasf2_download_dataset b2luigi option is set.
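A minimal sketch of the non-blocking idea, using only the standard library. The class AsyncDownloadJob, its project_is_done method and the local JobStatus enum are hypothetical stand-ins rather than b2luigi's actual classes, and the download command is a placeholder for whatever the real gbasf2 download call would be:

```python
import subprocess
import sys
import time
from enum import Enum


class JobStatus(Enum):
    # Local stand-in for a job status enum; values chosen for illustration.
    running = "running"
    successful = "successful"
    aborted = "aborted"


class AsyncDownloadJob:
    """Hypothetical stand-in for a gbasf2 batch process (not b2luigi code)."""

    def __init__(self, download_cmd):
        # Placeholder for the real dataset download command line.
        self.download_cmd = download_cmd
        self._download_proc = None

    def project_is_done(self):
        # Assumption: the real check would query the grid project status.
        return True

    def get_job_status(self):
        if not self.project_is_done():
            return JobStatus.running
        if self._download_proc is None:
            # Start the download and return immediately instead of waiting.
            self._download_proc = subprocess.Popen(self.download_cmd)
            return JobStatus.running
        returncode = self._download_proc.poll()  # None while still running
        if returncode is None:
            return JobStatus.running
        # Only report success once the download itself has exited cleanly.
        return JobStatus.successful if returncode == 0 else JobStatus.aborted


if __name__ == "__main__":
    # A short sleep stands in for a slow download.
    job = AsyncDownloadJob([sys.executable, "-c", "import time; time.sleep(0.5)"])
    while (status := job.get_job_status()) is JobStatus.running:
        time.sleep(0.1)
    print(status)
```

Each call to get_job_status returns right away, so the caller can keep polling other jobs while the download runs. Since Popen inherits the parent's stdin by default, a prompt from the child process should still reach the terminal, though this has not been tried against the real gbasf2 tools here.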

Something similar might be done for the submission.

This is not easy and I don't know if we can do both cases. The subprocess sometimes might require user input, e.g. a CA certificate or SSH key password, so this should still work. Error handling should also be thought about. As I don't have much experience with async subprocesses, I'd be happy about help.

If I'm just too stupid for parallel_scheduling and these blocking operations are no problem with it properly enabled, then this can be closed. (Though parallel_scheduling also only works for picklable tasks.)

Bilokin commented 3 years ago

This is a very interesting functionality for our project. I wonder whether one could split the Basf2PathTask that runs on the grid into separate luigi tasks, like JobSubmissionTask, JobMonitoringTask and DatasetDownloadingTask, which might help to parallelize the code, if that makes sense.

Bilokin commented 2 years ago

Hi @meliache,

this ticket is the closest to the topic I would like to raise. The gbasf2 project submission algorithm does not submit all projects first and then wait for them to finish; rather, some project submissions happen after project monitoring has started. This is not optimal, and we need to ensure that all gbasf2 projects have been submitted at the start of the b2luigi process. I am still not sure why this happens, but do you have an idea how to fix the issue?