ITC-CRIB / fairly

A package to create, publish, and clone research datasets
https://fairly.readthedocs.io
MIT License
19 stars, 6 forks

Add the possibility to add a random sleep when storing remote datasets #40

Open d-consoli opened 1 year ago

d-consoli commented 1 year ago

Hi, thanks for your work, it is a really useful tool! Unfortunately, when I use it to download Zenodo datasets that contain several (small) files, Zenodo receives too many requests and stops the download. This can be solved easily, for instance by adding a random sleep to the for loop that downloads the remote dataset's files, which is what I did in my fork of your project in this commit. If you think this would be a useful feature and you agree with the way I implemented it, I can add an ad hoc test and create a pull request to your repository, or alternatively look for a better solution. Thanks for your help!
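
For illustration, the idea of a random sleep in the download loop could look roughly like the sketch below. This is not the code from the fork or from fairly itself: `download_all`, `download_one`, `min_delay`, and `max_delay` are hypothetical names, and the delay bounds are arbitrary assumptions rather than values derived from Zenodo's limits.

```python
import random
import time


def download_all(files, download_one, min_delay=0.5, max_delay=2.0, sleep=time.sleep):
    """Download each file, pausing for a random interval between requests.

    `download_one` is a hypothetical callable that fetches a single file.
    The delay bounds (in seconds) are illustrative assumptions.
    """
    results = []
    for index, file in enumerate(files):
        if index > 0:
            # Spread requests out so the server does not see a burst.
            sleep(random.uniform(min_delay, max_delay))
        results.append(download_one(file))
    return results
```

No pause is taken before the first file, so a dataset with a single file downloads without any added delay. The `sleep` parameter exists only so that the waiting can be replaced in a test.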

girgink commented 1 year ago

Hi @d-consoli, thanks for the suggestion! It is certainly a useful feature. For another project I wrote similar code in the past: it waits for a random period in a loop, so that if the wait is not sufficient it waits longer, until the file becomes available or a maximum number of tries is reached. I can add it to the store method. Meanwhile, please feel free to create a pull request for your solution if you wish.
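
A rough sketch of that retry approach, assuming the wait grows by a fixed factor after each failed try, is shown below. It is not the code from the other project: `retry_with_random_wait`, `action`, `base_delay`, `factor`, and `retry_on` are hypothetical names, and catching every `Exception` by default is a simplification that a real implementation would likely narrow to rate-limit errors.

```python
import random
import time


def retry_with_random_wait(action, max_tries=5, base_delay=1.0, factor=2.0,
                           retry_on=(Exception,), sleep=time.sleep):
    """Call `action` until it succeeds or `max_tries` is reached.

    After each failure, wait a random time between `delay` and
    `delay * factor` seconds, then raise the lower bound for the next try.
    All parameter names and defaults are illustrative assumptions.
    """
    delay = base_delay
    for attempt in range(1, max_tries + 1):
        try:
            return action()
        except retry_on:
            if attempt == max_tries:
                # Out of tries: let the caller see the last error.
                raise
            sleep(random.uniform(delay, delay * factor))
            delay *= factor
```

A caller could wrap a single file download, for example `retry_with_random_wait(lambda: download_one(file))`, where `download_one` is again a hypothetical function. Randomizing the wait avoids several clients retrying at exactly the same moment.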