CS-SI / eodag

Earth Observation Data Access Gateway
https://eodag.readthedocs.io
Apache License 2.0

extract downloaded STAC assets #391

Open zbenta opened 2 years ago

zbenta commented 2 years ago

Describe the bug: While using the eodag containers, stac-server does not extract products when they are downloaded through the link generated by the server. When we run a simple Python script inside the stac-server container, the downloaded products are extracted automatically.

Code To Reproduce: We have explicitly configured eodag to extract the products (extract: true), as per the documentation.

eodag.yaml

creodias:
  priority: 100 # Lower value means lower priority (Default: 0)
  search:   # Search parameters configuration
  download:
      extract: true
      outputs_prefix: /data/out/
      dl_url_params: provider=creodias
  auth:
      credentials:
          username: CREODIAS_USERNAME
          password: CREODIAS_PWD

The script we used to download the products is based on the eodag documentation:

from eodag import EODataAccessGateway

dag = EODataAccessGateway()
products, total_count = dag.search(
   productType="S2_MSI_L1C",
   start="2018-01-01",
   end="2018-01-02",
   geom=(1, 43, 2, 44)
)
product_paths = dag.download_all(products)

Environment:

Additional context: Even if we use wget or curl to download the data from our stac-server using the generated URL, we only get the zip files; there is no automatic extraction as configured in the eodag.yaml file.

sbrunato commented 2 years ago

Hello @zbenta, and thanks for reporting this issue.

When eodag is used as a STAC proxy for creodias data, each zipped creodias product is made available as a single STAC asset.

As a result, downloading with wget or curl gives you zipped data that you have to extract separately.
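
Until something like that exists on the server side, the separate extraction step can be scripted on the client. The following is only a minimal sketch using the Python standard library, not an eodag feature: the helper names `extract_archive` and `download_and_extract` and the default file name `product.zip` are hypothetical, and the asset URL is whatever download link your stac-server generated.

```python
import shutil
import urllib.request
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract zip_path into dest_dir/<archive name without .zip>."""
    zip_path = Path(zip_path)
    target = Path(dest_dir) / zip_path.stem
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


def download_and_extract(url, dest_dir, filename="product.zip"):
    """Download url to dest_dir/filename, then extract it if it is a zip.

    Hypothetical helper: `url` is the asset download link generated by
    the stac-server; `filename` is an arbitrary local name.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    zip_path = dest_dir / filename
    # Stream the response to disk instead of loading it into memory.
    with urllib.request.urlopen(url) as response, open(zip_path, "wb") as out:
        shutil.copyfileobj(response, out)
    if not zipfile.is_zipfile(zip_path):
        return zip_path  # not a zip archive: leave the file as downloaded
    return extract_archive(zip_path, dest_dir)
```

Calling `download_and_extract(asset_url, "/data/out/")` would leave both the zip file and an extracted directory next to it under `/data/out/`; deleting the archive afterwards is left out of the sketch on purpose.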

One thing that could be done, but which is not available yet for eodag, would be to:

Would this match your needs?

zbenta commented 2 years ago

Hi @sbrunato, thanks for your feedback. I think that automatic extraction of the assets would solve our issue: we could then use the data in our pipeline without any manual intervention.