ECMWFCode4Earth / ml_drought

Machine learning to better predict and understand drought. Moving github.com/ml-clim
https://ml-clim.github.io/drought-prediction/

Request to Use Axel instead of Wget for Exporters #127

Open v2thegreat opened 4 years ago

v2thegreat commented 4 years ago

Hey! I noticed that the download speed for the exporters is a bit slow compared to what we see in our own pipeline. Have you considered using something like Axel, which parallelizes each download across multiple connections? I see that this is something that's already done here, but it carries Python overhead, and that effort might be better spent somewhere else.

Looking at how you've done it in src/exporters/chirps.py, it seems that speeding up the downloads should only require modifying this line, given an axel configuration that produces the same results.
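As a rough sketch of the idea (not the project's actual code): the helper below builds either an axel or a wget command line for a single file. The function name `build_download_command`, its parameters, and the default of 4 connections are hypothetical, and the wget flags shown are only an assumption about what an exporter might pass. It relies on axel's `-n` (number of connections), `-o` (output path), and `-q` (quiet) options, and falls back to wget when axel is not installed.

```python
import shutil
from pathlib import Path
from typing import List, Optional


def build_download_command(
    url: str,
    output_dir: Path,
    connections: int = 4,
    use_axel: Optional[bool] = None,
) -> List[str]:
    """Return an argv list that downloads `url` into `output_dir`.

    Hypothetical helper: prefers axel when it is on the PATH and
    otherwise falls back to wget, so axel stays an optional dependency.
    """
    if use_axel is None:
        # Detect axel at runtime instead of requiring it.
        use_axel = shutil.which("axel") is not None

    if use_axel:
        # Name the output file explicitly so it lands in output_dir.
        filename = url.rstrip("/").split("/")[-1]
        target = output_dir / filename
        return ["axel", "-q", "-n", str(connections), "-o", str(target), url]

    # Fallback: wget saves the file under the given directory prefix.
    return ["wget", "-q", "-P", str(output_dir), url]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Detecting axel at runtime means nothing breaks for users who only have wget.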

Finally, since downloading the data is an important part of the pipeline, this could speed up the overall process substantially as the project grows to include other datasets in the future.

gabrieltseng commented 4 years ago

Hi!

Axel seems very interesting, so we'll take a look! We do want to minimize the number of dependencies in the pipeline, so we might not integrate axel straight away.

Thank you!

tommylees112 commented 4 years ago

It's really great of you to take an interest in the pipeline, @v2thegreat! Do you work with environmental data often? How would you like to use the pipeline?