Progress messages for add_keyfiles command; other slow steps

Arcadia-Science / glial-origins

Evolutionary origins and relationships of glial cell types

MIT License


Progress messages for add_keyfiles command; other slow steps #42

Status: Closed (jasegehring closed this issue 1 year ago)

jasegehring commented 1 year ago

I believe this command can frequently take a long time to run. I'm getting curl feedback on download time, but a short printed message about what the function is doing would be nice: "Downloading Genome File...", "Downloading GTF File...", etc.

For example, my genome download from Ensembl was only going at 0.5 Mb per second, and it would be nice to have that kind of feedback during potentially long-running compute tasks.

mezarque commented 1 year ago

The add_keyfiles command should be very quick, as it's simply modifying the BioFileDocket object to put specific keyfiles under specific attribute values. Perhaps this is referring to some aspect of the get_from_url or s3_transfer and related downloading functions? Since these use subprocess.run(), we should be able to add a flag that outputs stdout or stderr to the Jupyter notebook output. We can also add additional print() statements throughout download functions in order to keep users apprised of what's happening.
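As a rough illustration of the idea above, here is a minimal sketch of a wrapper that prints a message before a download step and optionally echoes the subprocess output line by line. The helper name `run_with_feedback` and its `message` and `verbose` parameters are hypothetical and are not part of the existing codebase. It uses `subprocess.Popen` rather than `subprocess.run()` so that output can be printed while the command is still running.

```python
import subprocess
import sys


def run_with_feedback(cmd, message, verbose=True):
    """Hypothetical helper: announce a step, run cmd, optionally echo its output.

    Returns the list of output lines; raises CalledProcessError on failure.
    """
    print(message, flush=True)
    lines = []
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # curl writes its progress meter to stderr
        text=True,
    ) as proc:
        # Text mode treats "\r" as a line ending too, so progress-meter
        # updates arrive here one line at a time.
        for line in proc.stdout:
            line = line.rstrip("\n")
            lines.append(line)
            if verbose:
                print(line, flush=True)
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(
            proc.returncode, cmd, output="\n".join(lines)
        )
    print("Done.", flush=True)
    return lines
```

A download function could then call something like `run_with_feedback(["curl", "-L", "-o", out_path, url], "Downloading Genome File...")`, where `out_path` and `url` are placeholders. Printing from Python, instead of letting the child process write to the terminal directly, is what should make the output appear in the notebook cell.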

jasegehring commented 1 year ago

^^ Yup, that's what I'm getting at. I always appreciate dynamic feedback for long steps: print statements, progress bars, progress trackers, etc., just to know someone is still there.

The code is pretty nested, so printed feedback is helpful to know what's executing.

For example, I added rsync functionality and included the -P flag for progress tracking. I could also add a print statement that says "downloading xx file".
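A minimal sketch of what that could look like, assuming a hypothetical `rsync_download` function (the actual rsync code in the repository may be structured differently). The `-P` flag is rsync's shorthand for `--partial --progress`; the function and parameter names here are illustrative only.

```python
import posixpath
import subprocess


def build_rsync_command(source, destination):
    """Build an rsync command list with -P (--partial --progress)."""
    return ["rsync", "-P", source, destination]


def download_message(source):
    """Return a short status line naming the file being fetched."""
    return f"Downloading {posixpath.basename(source)}..."


def rsync_download(source, destination):
    """Hypothetical wrapper: print what is being downloaded, then run rsync."""
    print(download_message(source), flush=True)
    # check=True raises CalledProcessError if rsync exits with an error
    subprocess.run(build_rsync_command(source, destination), check=True)
```

Note that when this runs inside a Jupyter notebook, rsync's own progress output may go to the terminal that launched the kernel and not to the cell, so the printed message is the part users are most likely to see.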