earthpulse / eotdl

Earth Observation Training Datasets
https://eotdl.com
MIT License

Data access upgrades #168

Closed juansensio closed 2 months ago

juansensio commented 4 months ago

Pierre-Jean Coquard from AGENIUM Space:

We've encountered a number of issues that, in some cases, can make retrieving data more complicated than using a third-party API. I'm not sure where I should post this kind of user feedback, so I'm sending it directly to you.

• The size of a downloadable image is limited to 2500 x 2500 pixels, but the bounding box is specified in latitude / longitude. This makes it less obvious to the user whether their request is valid. Moreover, when creating datasets for machine learning, it could save time to get an image in a single function call from its centre coordinate and its width and height in pixels.

• When downloading a batch of images, the default file name is "<type>_<acquisition-date>.tif". As a result, downloading multiple images from the same sensor on the same date overwrites the previous image, since they all get the same name. An option to rename the output file in the download function, instead of having to rename it between downloads, would help.

• It would be great to be able to download full Sentinel products in order to process a large-scale database.
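
For the first point, a rough client-side check can tell whether a latitude / longitude bounding box stays within the 2500 x 2500 pixel limit before a request is sent. The sketch below is not part of eotdl: the function names are hypothetical, and the conversion assumes roughly 111,320 m per degree of latitude with longitude scaled by the cosine of the mid-latitude, so the result is only an estimate.

```python
import math

MAX_PIXELS = 2500  # per-side limit quoted in the feedback above
METERS_PER_DEGREE = 111_320.0  # assumed approximate length of one degree of latitude


def bbox_size_in_pixels(bbox, resolution_m):
    """Estimate (width, height) in pixels of a (min_lon, min_lat, max_lon, max_lat) box."""
    min_lon, min_lat, max_lon, max_lat = bbox
    # Longitude degrees shrink with latitude, so scale them by cos(mid-latitude).
    mid_lat = math.radians((min_lat + max_lat) / 2.0)
    width_m = (max_lon - min_lon) * METERS_PER_DEGREE * math.cos(mid_lat)
    height_m = (max_lat - min_lat) * METERS_PER_DEGREE
    return math.ceil(width_m / resolution_m), math.ceil(height_m / resolution_m)


def fits_download_limit(bbox, resolution_m, max_pixels=MAX_PIXELS):
    """Return True if both estimated sides are within the pixel limit."""
    width, height = bbox_size_in_pixels(bbox, resolution_m)
    return width <= max_pixels and height <= max_pixels
```

At 10 m resolution, a box 0.2 degrees on a side near latitude 41.8 passes this check, while a box 0.5 degrees on a side does not.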

I understand that some of these limitations are due to how the Sentinel Hub API works. However, I think it would be beneficial to implement wrappers with more options to make the available data easier to access.

fmariv commented 4 months ago

Makes sense. Renaming the files is something that can be done very easily with a new parameter, while the first point would require a new wrapper, perhaps giving the centroid and size of the bounding box.
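
As a rough illustration of what such a wrapper could compute, the sketch below builds a latitude / longitude bounding box from a centre point, a pixel size in metres, and a width and height in pixels. It is only a sketch under stated assumptions: `approx_bbox_from_centre` is a hypothetical name, not an eotdl function, and it uses a local flat-Earth approximation (about 111,320 m per degree of latitude) that is only reasonable for small boxes away from the poles.

```python
import math

METERS_PER_DEGREE = 111_320.0  # assumed approximate length of one degree of latitude


def approx_bbox_from_centre(lon, lat, pixel_size_m, width_px, height_px):
    """Return (min_lon, min_lat, max_lon, max_lat) around a centre point.

    Hypothetical helper based on a flat-Earth approximation.
    """
    # Half the box extent in metres, converted to degrees on each axis.
    half_height_deg = (height_px * pixel_size_m / 2.0) / METERS_PER_DEGREE
    half_width_deg = (width_px * pixel_size_m / 2.0) / (
        METERS_PER_DEGREE * math.cos(math.radians(lat))
    )
    return (
        lon - half_width_deg,
        lat - half_height_deg,
        lon + half_width_deg,
        lat + half_height_deg,
    )


# Example: a 512 x 512 pixel box at 10 m per pixel.
bbox = approx_bbox_from_centre(lon=12.7, lat=41.8, pixel_size_m=10, width_px=512, height_px=512)
```

Working in pixels this way keeps the request size explicit, which also makes it easy to stay under a per-side pixel limit.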

I don't quite understand the last request: what exactly do we mean by full Sentinel product?

fmariv commented 2 months ago

This issue can be closed. The notebook 11_download_sentinel_imagery now shows the following:

  1. How to create a bounding box from coordinates, giving the width and height in pixels and the pixel size.

    from eotdl.tools import bbox_from_centroid

    x = 12.7
    y = 41.8

    custom_bbox = bbox_from_centroid(x=x, y=y, pixel_size=10, width=512, height=512)

  2. How to name downloaded images.

    # This will save the image as Jaca.tif, as this only downloads one image
    download_sentinel_imagery(
        output="data/jaca_bulk",
        time_interval="2020-01-04",
        bounding_box=jaca_bounding_box,
        sensor="sentinel-2-l2a",
        name="Jaca",
    )

    # This will save the images as Jaca_<date>.tif, as this downloads several images
    download_sentinel_imagery(
        output="data/jaca_bulk",
        time_interval=("2020-01-01", "2020-01-10"),
        bounding_box=jaca_bounding_box,
        sensor="sentinel-2-l2a",
        name="Jaca",
    )
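
When several requests are scripted against the same output folder, the value passed as `name` still has to differ between requests to avoid overwriting. One possible approach, sketched below with only the standard library, is to pick a name that has no matching file yet. `unused_name` is a hypothetical helper, not part of eotdl, and it assumes files are saved as `<name>.tif` or `<name>_<date>.tif`, as described above.

```python
from pathlib import Path


def unused_name(output_dir, base_name, suffix=".tif"):
    """Return base_name, or base_name_1, base_name_2, ..., whichever has no file yet.

    Hypothetical helper: a name counts as taken if <name><suffix> exists or if
    any <name>_*<suffix> file (for example a date-suffixed image) exists.
    """
    directory = Path(output_dir)
    if not directory.is_dir():
        return base_name  # nothing downloaded yet, so nothing to collide with

    def is_taken(stem):
        exact = (directory / f"{stem}{suffix}").exists()
        dated = any(directory.glob(f"{stem}_*{suffix}"))
        return exact or dated

    candidate = base_name
    counter = 1
    while is_taken(candidate):
        candidate = f"{base_name}_{counter}"
        counter += 1
    return candidate
```

The returned string could then be passed as the `name` argument of each download call. The check is conservative: an existing `Jaca_1.tif` also marks `Jaca` as taken.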



The option to download an entire product has not been implemented, since the data access method is based on Sentinel Hub.