CS-SI / eodag

Earth Observation Data Access Gateway
https://eodag.readthedocs.io
Apache License 2.0

Issue with creodias provider #312

Open alunacob opened 3 years ago

alunacob commented 3 years ago

Describe the bug

When using eodag with the creodias provider, the search works, but the download fails inconsistently (at a different percentage of completion for the same file). When the download finishes (with extract: false set in the config), the file is renamed from .SAFE.zip to .SAFE, and the product is incomplete and unusable.

Code To Reproduce

eodag -vvv download --conf /tmp/creodias.conf --search-results /tmp/SLC.geojson

Output 1st try:

2021-07-08 14:10:01,901-15s eodag.config                     [INFO    ] (config           ) Loading user configuration from: /tmp/creodias.conf
2021-07-08 14:10:02,581-15s eodag.core                       [DEBUG   ] (core             ) Opening product types index in /root/.config/eodag/.index
2021-07-08 14:10:02,582-15s eodag.core                       [INFO    ] (core             ) Downloading 1 products
Downloading products:   0%|                                                                                                                                      | 0/1 [00:00<?, ?product/s]
2021-07-08 14:10:03,005-15s eodag.plugins.download.http      [INFO    ] (http             ) Download url: https://zipper.creodias.eu/download/55cbdcca-aaa9-5fd2-aef3-36d5596bcaa1
 77%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                                   | 2.18G/2.82G [29:56<05:08
Downloading products: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [36:04<00:00, 2164.82s/product]
Traceback (most recent call last):
  File "/opt/eod/venv/bin/eodag", line 8, in <module>
    sys.exit(eodag())
  File "/opt/eod/venv/lib/python3.8/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/opt/eod/venv/lib/python3.8/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/opt/eod/venv/lib/python3.8/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/eod/venv/lib/python3.8/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/eod/venv/lib/python3.8/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/opt/eod/venv/lib/python3.8/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/opt/eod/venv/lib/python3.8/site-packages/eodag/cli.py", line 412, in download
    downloaded_files, downloaded_files_json = satim_api.download_all(search_results)
TypeError: cannot unpack non-iterable NoneType object

2nd try:

eodag -vvv download --conf /tmp/creodias.conf --search-results /tmp/SLC.geojson
2021-07-08 14:10:01,901-15s eodag.config                     [INFO    ] (config           ) Loading user configuration from: /tmp/creodias.conf
2021-07-08 14:10:02,581-15s eodag.core                       [DEBUG   ] (core             ) Opening product types index in /root/.config/eodag/.index
2021-07-08 14:10:02,582-15s eodag.core                       [INFO    ] (core             ) Downloading 1 products
Downloading products:   0%|                                                                                                                                      | 0/1 [00:00<?, ?product/s]
2021-07-08 14:10:03,005-15s eodag.plugins.download.http      [INFO    ] (http             ) Download url: https://zipper.creodias.eu/download/55cbdcca-aaa9-5fd2-aef3-36d5596bcaa1
2021-07-08 14:40:10,699-15s eodag.plugins.download.http      [DEBUG   ] (http             ) Download recorded in /mnt/cache/.downloaded/3261f94003115d5abdec56d1ed678980
2021-07-08 14:40:10,962-15s eodag.plugins.download.http      [WARNING ] (http             ) Downloaded product is not a Zip File. Please check its file type before using it
Downloading products: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [30:08<00:00, 1808.38s/product]
Traceback (most recent call last):
  File "/opt/eod/venv/bin/eodag", line 8, in <module>
    sys.exit(eodag())
  File "/opt/eod/venv/lib/python3.8/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/opt/eod/venv/lib/python3.8/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/opt/eod/venv/lib/python3.8/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/eod/venv/lib/python3.8/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/eod/venv/lib/python3.8/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/opt/eod/venv/lib/python3.8/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/opt/eod/venv/lib/python3.8/site-packages/eodag/cli.py", line 412, in download
    downloaded_files, downloaded_files_json = satim_api.download_all(search_results)
TypeError: cannot unpack non-iterable NoneType object
 66%|████████████████████████████████████████████████████████████████████████████████████████████████▏                                                 | 1.86G/2.82G [30:07<15:37, 1.03MB/s]

Environment:


sbrunato commented 3 years ago

Hello @alunacob , can you please post:

Have you tried changing the destination directory (outputs_prefix in /tmp/creodias.conf) to /tmp or $HOME?

Thank you.

alunacob commented 3 years ago

Thanks for commenting:

creodias:
  auth:
    credentials:
      password: *************
      username: *****************
  download:
    extract: false
    outputs_prefix: /mnt/cache
  priority: 5
  search: {}

I write the outputs to a mounted volume. I could change this for testing purposes, but it is important to store the outputs there so that they persist if, for example, the Docker container is shut down.

Thank you!

alunacob commented 3 years ago

It seems that when execution arrives here: https://github.com/CS-SI/eodag/blob/develop/eodag/plugins/download/http.py#L220

the file is not a valid zip, so it gets renamed. That is the proper behavior, because the download is incomplete!

However, the real issue is that the streamed download somehow exits the {get chunk -> write to file} loop before the download finishes (https://github.com/CS-SI/eodag/blob/develop/eodag/plugins/download/http.py#L210).

I have no clue why, though. I checked that the installed requests package is version 2.25.1, which is the latest.
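One way to make such an early exit visible is to count the bytes actually written and compare them with the size the server announced. The following is a minimal sketch, not eodag's implementation: write_stream_checked and IncompleteDownloadError are hypothetical names, and the check assumes the server sends a Content-Length header and no Content-Encoding.

```python
import os
import tempfile
from typing import Iterable


class IncompleteDownloadError(IOError):
    """Raised when the number of bytes received differs from the announced size."""


def write_stream_checked(chunks: Iterable[bytes], expected_size: int, dest: str) -> int:
    """Write chunks to dest, then verify the byte count against expected_size."""
    written = 0
    with open(dest, "wb") as fh:
        for chunk in chunks:
            if chunk:  # skip empty keep-alive chunks
                fh.write(chunk)
                written += len(chunk)
    if written != expected_size:
        raise IncompleteDownloadError(
            f"{dest}: got {written} bytes, expected {expected_size}"
        )
    return written


# With requests, the inputs would come from a streamed response, for example:
#   r = requests.get(url, stream=True)
#   write_stream_checked(r.iter_content(chunk_size=64 * 1024),
#                        int(r.headers["Content-Length"]), dest)
```

Depending on the requests/urllib3 versions in use, a connection that closes early may simply end the iteration without raising, so an explicit size check like this one can help tell a truncated transfer apart from a complete one.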

Any hint on what to try/check would be really appreciated.

Thanks in advance.

sbrunato commented 3 years ago

@alunacob, thanks for the details. I'll run some tests with v1.5.2 + creodias and see whether I get the same issues.

sbrunato commented 3 years ago

@alunacob, I could not reproduce the issues with v1.5.2 + creodias. It might be related to the mounted volume in your k8s environment. Can you try changing outputs_prefix to /tmp or $HOME? You can change it temporarily from your Python code with:

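import os
from eodag import EODataAccessGateway

# Override the creodias download directory for this process only;
# set it before the gateway is created so the configuration picks it up.
os.environ["EODAG__CREODIAS__DOWNLOAD__OUTPUTS_PREFIX"] = "/tmp"
dag = EODataAccessGateway()

alunacob commented 3 years ago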

Thanks, I will update it and report the outcome

alunacob commented 3 years ago

I tried your suggestion and the same issue happens. I also tried a fresh installation of the latest eodag version (eodag (Earth Observation Data Access Gateway): version 2.3.1) through the CLI; the download seems to finish successfully, but the downloaded product is .SAFE (not .zip) and incomplete.

sbrunato commented 3 years ago

the download seems to finish successfully, but the downloaded product is .SAFE (not .zip) and incomplete

This means that the result is incomplete or corrupted: zipfile.is_zipfile(downloaded_path) must have returned False, and you should have received the following warning:

Downloaded product is not a Zip File. Please check its file type before using it
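The check can be reproduced outside eodag on any downloaded file. The sketch below builds a small archive, truncates it to simulate an interrupted download, and tests both files; looks_complete is a hypothetical helper, and the file names are only illustrative.

```python
import os
import tempfile
import zipfile


def looks_complete(path: str) -> bool:
    """Return True if path is a zip archive whose members all pass the CRC check."""
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as zf:
        # testzip() returns the name of the first bad member, or None if all are fine.
        return zf.testzip() is None


tmp = tempfile.mkdtemp()
good = os.path.join(tmp, "product.SAFE.zip")
with zipfile.ZipFile(good, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("manifest.safe", b"x" * 1000)

# Simulate an interrupted download by keeping only the first half of the bytes.
truncated = os.path.join(tmp, "truncated.SAFE.zip")
with open(good, "rb") as src, open(truncated, "wb") as dst:
    dst.write(src.read()[: os.path.getsize(good) // 2])

print(looks_complete(good))       # True
print(looks_complete(truncated))  # False
```

A truncated archive loses the end-of-central-directory record stored at the end of the file, which is why zipfile.is_zipfile rejects it. Note that testzip() reads every member, so on a multi-gigabyte product it takes a while.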

I do not understand what could cause the download to be incomplete. We use a streamed download through requests, which should prevent it. Did you try downloading from another provider?

Also, is it possible to share your Dockerfile, or just enough of it to help reproduce the issue?

Thank you

remi-braun commented 2 years ago

I had the same issue on some random products when downloading all T32TLT L2A tiles yesterday. The vast majority work, but some don't, which is really weird. Maybe it's a problem on the creodias side?

I redownloaded these tiles through AWS and it is working fine.

sbrunato commented 2 years ago

@remi-braun, if you have the product IDs for which you had issues, can you please post them to help us reproduce the error? Thanks

remi-braun commented 2 years ago

Here they are 😄

S2A_MSIL2A_20210509T103021_N0300_R108_T32TLT_20210509T133016.SAFE
S2A_MSIL2A_20210618T103021_N0300_R108_T32TLT_20210618T133547.SAFE
S2A_MSIL2A_20210728T103031_N0301_R108_T32TLT_20210728T152601.SAFE
S2A_MSIL2A_20210906T103021_N0301_R108_T32TLT_20210906T151711.SAFE
S2A_MSIL2A_20211016T103031_N0301_R108_T32TLT_20211016T133750.SAFE
S2A_MSIL2A_20211125T103351_N0301_R108_T32TLT_20211125T133031.SAFE
S2B_MSIL2A_20220109T103319_N0301_R108_T32TLT_20220109T121931.SAFE
S2A_MSIL2A_20191226T103431_N0213_R108_T32TLT_20191226T120024.SAFE
S2A_MSIL2A_20190808T103031_N0213_R108_T32TLT_20190808T140801.SAFE
S2A_MSIL2A_20190828T103021_N0213_R108_T32TLT_20190828T164154.SAFE
S2A_MSIL2A_20190917T103021_N0213_R108_T32TLT_20190917T141750.SAFE
S2A_MSIL2A_20191007T103021_N0213_R108_T32TLT_20191007T131042.SAFE
S2A_MSIL2A_20191027T103131_N0213_R108_T32TLT_20191027T120218.SAFE
S2A_MSIL2A_20191116T103311_N0213_R108_T32TLT_20191116T115132.SAFE
S2A_MSIL2A_20191206T103421_N0213_R108_T32TLT_20191206T121006.SAFE