Slugger70 opened this issue 6 years ago
I have taken a quick look. This problem is solvable, but it requires `setup_data_libraries.py` to make full use of the bioblend API. The library module has a `wait_for_dataset` option that seems sufficient for this use case, and it includes a timeout as well. I will try a quick fix, but if that does not work I am afraid the whole script needs to be refactored, which may take some time.
It was as I feared. The entire script should be refactored to properly make use of the bioblend API. That would allow the library and dataset IDs to be tracked throughout the script so that `wait_for_dataset` can be called.
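As a rough sketch of what that could look like (the function names and the `gi` handle are my own, not from the current script; the bioblend calls are `LibraryClient.upload_file_from_local_path` and `LibraryClient.wait_for_dataset`, used here as I understand them): once the upload response is in hand, each uploaded dataset can be waited on individually.

```python
def dataset_ids(upload_response):
    """Extract dataset IDs from a bioblend library-upload response,
    which is a list of dataset dicts each carrying an 'id' key."""
    return [ds["id"] for ds in upload_response]

def upload_and_wait(gi, library_id, folder_id, path, maxwait=3600):
    """Upload one file to a library folder and wait only for that dataset.

    `gi` is assumed to be a bioblend.galaxy.GalaxyInstance connected to
    the target server; this is an untested sketch, not the actual fix.
    """
    response = gi.libraries.upload_file_from_local_path(
        library_id, path, folder_id=folder_id)
    for ds_id in dataset_ids(response):
        # wait_for_dataset polls a single dataset and raises on timeout,
        # so nothing else running on the server can stall the script.
        gi.libraries.wait_for_dataset(library_id, ds_id, maxwait=maxwait)
    return dataset_ids(response)
```

Because the library and dataset IDs are threaded through explicitly, the wait is scoped to exactly the datasets this run created.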
I propose following a more object-oriented approach, as in the get-tool-list function. Right now it is a sequence of functions that is hard to keep track of.
Does anyone with some spare time want to take this on?
I'm currently working on it a bit, to make sure it doesn't re-create libraries or re-upload datasets that already exist in them.
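One possible shape for that idempotency check, as a hedged sketch (the helper names are hypothetical; `gi.libraries.get_libraries()` and `gi.libraries.create_library()` are the bioblend calls I believe apply, but this is not the actual patch): look for an existing, non-deleted library of the same name before creating one.

```python
def existing_library_id(libraries, name):
    """Return the ID of a non-deleted library matching `name`, else None.

    `libraries` is the list of dicts returned by gi.libraries.get_libraries().
    """
    for lib in libraries:
        if lib.get("name") == name and not lib.get("deleted", False):
            return lib["id"]
    return None

def get_or_create_library(gi, name, description=""):
    """Reuse an existing library instead of re-creating it.

    `gi` is assumed to be a bioblend.galaxy.GalaxyInstance.
    """
    lib_id = existing_library_id(gi.libraries.get_libraries(), name)
    if lib_id is not None:
        return lib_id
    return gi.libraries.create_library(name, description=description)["id"]
```

The same look-before-upload pattern would apply to datasets inside each folder.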
:+1: feel free to make a pull request.
I have an issue with `setup_data_libraries.py`. When run, if there are any jobs on the target Galaxy server that are not in an `ok` state, the script never finishes: it gets stuck in a loop waiting for ALL jobs on the Galaxy server to reach that state. If the target Galaxy server has been around for some time, there will more than likely be some jobs in a `new` or `error` state, let alone other still-running jobs unrelated to library creation. I think it would be much better to capture the upload job IDs for each upload in a list and just wait for those to complete.
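A minimal sketch of that idea, under my own assumptions (function names are hypothetical; the set of "finished" states is my guess, and `gi.jobs.show_job(job_id)` is the bioblend call I believe returns a dict with a `state` key): poll only the captured job IDs and ignore everything else on the server.

```python
import time

# Assumption: these are the states we treat as "no longer running".
TERMINAL_STATES = {"ok", "error", "deleted"}

def unfinished(job_states):
    """Return the job IDs whose state is not yet terminal.

    `job_states` maps job ID -> state string.
    """
    return [jid for jid, state in job_states.items()
            if state not in TERMINAL_STATES]

def wait_for_jobs(gi, job_ids, timeout=3600, interval=10):
    """Poll only the captured upload jobs until they all finish.

    `gi` is assumed to be a bioblend.galaxy.GalaxyInstance. Raises
    TimeoutError if any job is still running when `timeout` expires.
    """
    deadline = time.time() + timeout
    while True:
        states = {jid: gi.jobs.show_job(jid)["state"] for jid in job_ids}
        pending = unfinished(states)
        if not pending:
            return states
        if time.time() > deadline:
            raise TimeoutError("Jobs still running: %s" % pending)
        time.sleep(interval)
```

With this, pre-existing `new` or `error` jobs on a long-lived server never enter the wait loop at all.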