emissions-api / sentinel5dl

Sentinel-5(P) Downloader
https://sentinel5dl.emissions-api.org
MIT License

Delete half-finished files #52

Closed lkiesow closed 4 years ago

lkiesow commented 4 years ago

If the download fails for any reason and we end up with a half-finished file, it would be nice to clean up and remove that file.
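
A minimal sketch of the cleanup described above, not the project's actual code. `download_to_file` and the `fetch` callable are hypothetical names used only for illustration: `fetch` is assumed to write the response body to the file object it receives and to raise on any error.

```python
import os


def download_to_file(fetch, path):
    """Write a download to `path`; remove the file if the transfer fails.

    `fetch` is a hypothetical callable (an assumption of this sketch) that
    writes the body to the open file object it is given and raises on error.
    """
    try:
        with open(path, 'wb') as f:
            fetch(f)
    except BaseException:
        # BaseException also covers KeyboardInterrupt, so an aborted run
        # does not leave a partial file behind either.
        if os.path.exists(path):
            os.remove(path)
        raise
```

The original exception is re-raised after the cleanup, so callers still see why the download failed.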

baevpetr commented 4 years ago

Hi @lkiesow. Pycurl has .RESUME_FROM. Instead of downloading a file from the beginning again, maybe try to continue the download, with retries? And maybe download with threads, so that slow files whose download has to be restarted do not hold up the loading of data in general. The standard threading module should be enough, since these are ordinary I/O operations. Maybe open a new issue for this.
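
A rough sketch of the resume-with-retries idea, under stated assumptions. `download_with_resume`, `pycurl_transfer`, and the `transfer` callable are hypothetical names and not part of sentinel5dl. It also assumes the server honours byte-range requests, which the thread has not confirmed for the ESA backend.

```python
import os


def pycurl_transfer(url, offset, fileobj):
    """Fetch `url` starting at byte `offset`, writing the body to `fileobj`."""
    import pycurl  # imported lazily so the retry loop can be used without it

    curl = pycurl.Curl()
    try:
        curl.setopt(pycurl.URL, url)
        curl.setopt(pycurl.RESUME_FROM, offset)  # ask for bytes from `offset` on
        curl.setopt(pycurl.WRITEDATA, fileobj)
        curl.perform()  # raises pycurl.error if the transfer fails
    finally:
        curl.close()


def download_with_resume(transfer, path, retries=3):
    """Call `transfer(offset, fileobj)` until it succeeds or retries run out.

    Each attempt appends to `path`, starting at the current file size, so
    bytes that already arrived are not downloaded again.
    """
    for attempt in range(retries + 1):
        offset = os.path.getsize(path) if os.path.exists(path) else 0
        try:
            with open(path, 'ab') as f:
                transfer(offset, f)
            return
        except Exception:
            if attempt == retries:
                raise  # give up; the partial file is left for the caller
```

With pycurl installed, the two pieces could be combined as `download_with_resume(lambda offset, f: pycurl_transfer(url, offset, f), path)`, where `url` and `path` are supplied by the caller.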

lkiesow commented 4 years ago

Hi @baevpetr, a resume would indeed be a nice option, though my guess is that the server would need to support byte-range requests for it to work. No idea if the ESA backend supports that, but we could certainly try. Are you by any chance interested in trying to implement this? ;-P


About the threading, can you create an issue for that? In general, I think this makes sense, and it's what I'm already doing when downloading large data sets by just downloading 4-5 parts in parallel. We should probably also limit the download to a certain number of "workers", since downloading very large sets would otherwise make the individual connections more fragile and could result in more failed connections.
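
One way to cap the number of workers is a thread pool from the standard library. This is only an illustrative sketch, not necessarily how the project implements it. `download_all` and the `download` callable are hypothetical names, and the default of 4 workers simply mirrors the "4-5 parts in parallel" mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(download, urls, workers=4):
    """Run `download(url)` for every URL, with at most `workers` at a time.

    `download` is a hypothetical callable (an assumption of this sketch)
    that fetches a single URL and returns, for example, the local path.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() consumes the results in input order and re-raises the
        # first exception that occurred in a worker, if any.
        return list(pool.map(download, urls))
```

Threads are sufficient here because the work is I/O-bound; the pool size bounds how many connections are open at once.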

lkiesow commented 4 years ago

@baevpetr, with pull request #57 @shaardie introduced parallel downloads.