blaylockbk / goes2go

Download and process GOES-16 and GOES-17 data from NOAA's archive on AWS using Python.
https://goes2go.readthedocs.io/
MIT License

Use pooch to manage the file downloads #50

Closed ocefpaf closed 1 year ago

ocefpaf commented 1 year ago

There are many advantages to using pooch instead of maintaining a library's own downloader. The main one is a checksum/version check. See https://pypi.org/project/pooch/ for more info.

This is something I can work on if there is interest.

blaylockbk commented 1 year ago

That would be awesome! If I understand what pooch does, it will prevent redownloading data if the data has already been downloaded, and just makes downloading simpler in general.

I had thought about wrapping rclone for the checksum, but this idea sounds better. It looks like MetPy also uses pooch, and that makes me feel good about using it here.

Yeah, I'd appreciate any contributions you would like to provide. It'll give me more experience working with pull requests and contributors, too. Thanks for offering 😊

ocefpaf commented 1 year ago

After thinking about this for a while, I realized the one thing I wanted from pooch was a checksum, because sometimes the download fails and goes2go crashes when it tries to read an incomplete file. However, we can achieve that by other means, and the current download/file management scheme here is good and doesn't need a rewrite with pooch.

I'll open a new issue about the checksum later.
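As a rough idea of what "other means" could look like, here is a minimal sketch using only the standard library. The function names `sha256_of` and `is_complete` are hypothetical and not part of goes2go, and where the expected digest would come from is left open here.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_complete(path, expected_sha256):
    """True if the file exists and its digest matches the expected one."""
    path = Path(path)
    return path.is_file() and sha256_of(path) == expected_sha256.lower()
```

A downloader could call `is_complete(path, expected)` before opening a file and delete and re-download the file when it returns `False`, rather than crashing on a partial file.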