allenai / satlas

Apache License 2.0

A problem about the size of dataset #1

Closed by cbachen1997 1 year ago

cbachen1997 commented 1 year ago

Dear esteemed authors,

I would like to extend my appreciation for your hard work in releasing the Satlas dataset. I am highly interested in utilizing some of the Sentinel-2 data for my research and have encountered a challenge in the process of downloading the data.

The provided compressed file is quite large, with a size of 4TB, which requires a considerable amount of storage space. Furthermore, the download process may be susceptible to interruptions.

I was wondering if it would be possible to provide an alternative method of downloading the data in smaller, separate volumes, making it easier to access the specific data people require.

Thank you for your time and consideration.

favyen2 commented 1 year ago

The download should be resumable via the HTTP Range header, so it can be restarted without losing progress. Most clients, such as wget (with the --continue flag), support resuming an interrupted download this way.
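
For anyone who wants to script this instead of using wget, here is a minimal Python sketch of resuming through the Range header. It uses only the standard library. The function names, the example URL, and the destination file name are hypothetical, and treating a 416 response as "file already complete" is an assumption rather than documented behavior of the Satlas download server.

```python
import os
import urllib.error
import urllib.request


def resume_headers(dest_path):
    """Return (headers, offset) requesting only the bytes not yet on disk."""
    offset = os.path.getsize(dest_path) if os.path.exists(dest_path) else 0
    headers = {"Range": f"bytes={offset}-"} if offset > 0 else {}
    return headers, offset


def resume_download(url, dest_path, chunk_size=1 << 20):
    """Download url to dest_path, continuing from any partial file."""
    headers, offset = resume_headers(dest_path)
    request = urllib.request.Request(url, headers=headers)
    try:
        response = urllib.request.urlopen(request)
    except urllib.error.HTTPError as err:
        # Assumption: 416 (Range Not Satisfiable) means nothing is left to fetch.
        if err.code == 416:
            return offset
        raise
    with response:
        # 206: the server honored the range, so append.
        # 200: the server sent the whole file, so start over.
        resumed = response.status == 206
        written = offset if resumed else 0
        with open(dest_path, "ab" if resumed else "wb") as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
                written += len(chunk)
    return written


if __name__ == "__main__":
    # Hypothetical URL and file name; substitute the real archive link.
    resume_download("https://example.com/archive.tar", "archive.tar")
```

Re-running the script after an interruption picks up from the size of the partial file, which is roughly what `wget --continue` does with the same URL.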

That said, we plan to release an updated dataset in the next few months, and at that time we will consider offering an alternative option with many small archives, or perhaps a smaller version of the dataset. Thanks for the suggestion!

cbachen1997 commented 1 year ago

Thank you very much for taking my suggestion into consideration.

Best wishes to you.