allenai / satlas

Apache License 2.0

Extremely long download time #32

Closed loicland closed 5 months ago

loicland commented 6 months ago

Hi, thanks for the great dataset!

We are trying to download the data, but even with a good connection it seems like it would take literal months of uninterrupted downloading. The s2a link, for example, takes 45 days to download. Is there a solution for that?

Another question: we are only interested in the areas where we have both Sentinel-2 and NAIP images (and possibly Sentinel-1). Is there a way to download only the Sentinel-2 tiles that overlap with NAIP data?

favyen2 commented 6 months ago

It is hosted on S3 so you should be able to start many parallel downloads to increase the download speed.
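For anyone landing here later, one way to run parallel downloads against a single large file is to split it into byte ranges and fetch several ranges at once with HTTP `Range` requests, which S3 supports. The following is only a minimal sketch using the Python standard library, not the project's official tooling. The function names, the 64 MiB part size, and the worker count are illustrative assumptions, and it assumes the server answers a `HEAD` request with a `Content-Length` header.

```python
import concurrent.futures
import shutil
import urllib.request


def split_ranges(total_size, part_size):
    """Return inclusive (start, end) byte ranges that cover total_size bytes."""
    if total_size < 0 or part_size <= 0:
        raise ValueError("total_size must be >= 0 and part_size must be > 0")
    return [
        (start, min(start + part_size, total_size) - 1)
        for start in range(0, total_size, part_size)
    ]


def remote_size(url):
    """Ask the server for the file size (assumes HEAD returns Content-Length)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return int(response.headers["Content-Length"])


def fetch_range_to_file(url, dest, start, end):
    """Download bytes start..end and write them at the same offset in dest."""
    request = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(request) as response:
        # 206 means the server honoured the Range header; 200 would be the whole file.
        if response.status != 206:
            raise RuntimeError(f"server ignored Range request (status {response.status})")
        # Each worker opens its own handle, so seeks do not interfere with each other.
        with open(dest, "r+b") as out:
            out.seek(start)
            shutil.copyfileobj(response, out)


def parallel_download(url, dest, part_size=64 * 1024 * 1024, workers=16):
    """Download url to dest using several concurrent range requests."""
    size = remote_size(url)
    # Pre-allocate the output file so every worker can write at its own offset.
    with open(dest, "wb") as out:
        out.truncate(size)
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(fetch_range_to_file, url, dest, start, end)
            for start, end in split_ranges(size, part_size)
        ]
        for future in concurrent.futures.as_completed(futures):
            future.result()  # re-raise any download error
```

Usage would look like `parallel_download(url, "archive.tar")`, where `url` is whichever download link you are using and the destination name is a placeholder. The sketch has no retry or resume logic, so a failed part means starting over. For a download measured in days, an existing multi-connection download tool may be a better fit.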

It is a very big dataset, so it is difficult to host in multiple places. We do still want to host it on Hugging Face, but I haven't gotten around to it yet because there are some steps involved, like splitting it up into many small archive files.

Right now we don't have a version with just the Sentinel-2 tiles intersecting NAIP data. If you just need images, then https://github.com/allenai/satlas-super-resolution/ has many aligned NAIP and Sentinel-2 images, but if you need the labels too, then currently the only option is the larger download.

favyen2 commented 5 months ago

The dataset can now be downloaded from Hugging Face.