Open Rapsodia86 opened 6 months ago
Thanks for the report! That's definitely not a good user experience.
I think the thing for us to focus on here is improving the messaging, since we can't control the stability of the server(s) we're downloading from. How does the user know what it means when their download reaches 100% with errors in the log? Were those files retried and successfully downloaded? Or did the job complete with errors, never downloading some files? In this case, it was the latter, but @Rapsodia86 had to do their own investigation to learn that.
Some sort of summary message at the end would be really valuable. It should list out the URLs that failed so the user can investigate with the provider. E.g. f"Failed to download the following granule URLs within {retry_count} attempts.\n{urls}\n\nPlease contact the data provider ({provider_support_email}) to report errors or instability." Can we get the provider support email out of CMR metadata? Alternately, perhaps the call to earthaccess.download() should raise an error in this case. @Rapsodia86 what would be your preference as a user?
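To make the summary idea concrete, here is a minimal sketch of how failures could be collected during a bulk download and reported at the end. It is not earthaccess code: `DownloadReport`, `download_all`, `summarize`, and the injected `fetch` callable are all hypothetical names used for illustration, and the message text follows the wording proposed above.

```python
from dataclasses import dataclass, field


@dataclass
class DownloadReport:
    """Hypothetical container for the outcome of a bulk download."""

    succeeded: list = field(default_factory=list)
    failed: dict = field(default_factory=dict)  # URL -> error text


def download_all(urls, fetch):
    """Call fetch(url) for each URL and record failures instead of hiding them.

    `fetch` is a stand-in for whatever actually downloads one file.
    """
    report = DownloadReport()
    for url in urls:
        try:
            fetch(url)
            report.succeeded.append(url)
        except Exception as exc:  # broad on purpose: every failure is reported
            report.failed[url] = str(exc)
    return report


def summarize(report, retry_count, provider_support_email=None):
    """Build the end-of-run message listing every URL that failed."""
    if not report.failed:
        return f"Downloaded all {len(report.succeeded)} files."
    urls = "\n".join(report.failed)
    contact = f" ({provider_support_email})" if provider_support_email else ""
    return (
        f"Failed to download the following granule URLs within {retry_count} attempts.\n"
        f"{urls}\n\n"
        f"Please contact the data provider{contact} to report errors or instability."
    )
```

The same report object could also back the alternative of raising an error: a caller could check whether `report.failed` is non-empty and raise, rather than only printing the summary.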
Hi @mfisher87,
thanks for taking care of this.
When a requests.exceptions error occurs, how many retries/attempts are there? That summary would be helpful! Also, maybe a log file with a list of failed URLs?
I know that if I rerun earthaccess.download(), the files that already exist will be skipped (by the way, is only the filename checked, or the file size as well?).
However, a log file would give me another option: instead of running the whole download again, I could just pass the log file and run earthaccess.download() on that file. What do you think about it? Too much?
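A failed-URL log of the kind suggested here could be as simple as one URL per line. The sketch below shows only the write and read halves; `write_failed_urls` and `read_failed_urls` are hypothetical helpers, not part of earthaccess.

```python
from pathlib import Path


def write_failed_urls(failed_urls, log_path):
    """Write one failed URL per line so the file is easy to inspect or edit."""
    text = "".join(f"{url}\n" for url in failed_urls)
    Path(log_path).write_text(text, encoding="utf-8")


def read_failed_urls(log_path):
    """Read the URLs back, ignoring blank lines and surrounding whitespace."""
    lines = Path(log_path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]
```

The list returned by `read_failed_urls` could then be handed to a second download attempt. Whether earthaccess.download() accepts a plain list of URL strings in the installed version is an assumption to check against its documentation.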
We currently have no retry attempts; the only "smart" thing earthaccess does is that if a granule already exists in the target path, it won't try to download it again. We should implement a more robust mechanism to keep track of errors and retries. I like that behavior @mfisher87!
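On the filename-versus-size question: a check that looks only at whether the file exists could skip a file that was truncated by an error partway through. The sketch below shows what an optional size comparison might look like. It is not how earthaccess decides to skip files; `needs_download` is a hypothetical helper, and where `expected_size` would come from (for example a Content-Length header or granule metadata) is an assumption.

```python
from pathlib import Path


def needs_download(target, expected_size=None):
    """Return True if the file is missing or, when a size is known, mismatched.

    With expected_size=None this reduces to an existence-only check.
    """
    target = Path(target)
    if not target.is_file():
        return True  # nothing on disk yet
    if expected_size is None:
        return False  # existence-only check: trust the existing file
    return target.stat().st_size != expected_size  # re-download partial files
```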
Maybe we can consider these separate features? E.g.:
earthaccess.download(..., retries=10). Maybe defaults to a small number like 1 retry.
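As an illustration of what a `retries` option could mean in practice, here is a minimal retry wrapper. The `retries` parameter does not exist in earthaccess at the time of this discussion; `fetch_with_retries`, the injected `fetch` callable, and `backoff_seconds` are hypothetical names chosen for the sketch.

```python
import time


def fetch_with_retries(fetch, url, retries=1, backoff_seconds=0.0):
    """Call fetch(url), retrying up to `retries` extra times on any exception.

    retries=1 means at most two attempts in total. The last error is
    re-raised if every attempt fails.
    """
    if retries < 0:
        raise ValueError("retries must be >= 0")
    attempts = retries + 1
    for attempt in range(attempts):
        try:
            return fetch(url)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the final error
            if backoff_seconds:
                # simple exponential backoff between attempts
                time.sleep(backoff_seconds * 2 ** attempt)
```

A real implementation would probably retry only on transient errors such as 502 responses rather than on every exception; catching everything keeps the sketch short.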
Hello again, I wanted to download ECOSTRESS LST data, but I have been getting:
requests.exceptions.HTTPError: 502 Server Error: Bad Gateway
halfway through downloading a file. Here is a snippet:
And just a part of the output. I do not have any problems with those specific files when downloading directly from https://search.earthdata.nasa.gov/
Getting 454 granules, approx download size: 3.04 GB QUEUEING TASKS | : 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3632/3632 [00:00<00:00, 66708.32it/s] PROCESSING TASKS | : 31%|████████████████████████████████████████████████████████████████████████ | 1119/3632 [07:28<37:24, 1.12it/s]Error while downloading the file ECOv002_L2T_LSTE_27122_009_16TFM_20230418T182710_0710_01_height.tif Traceback (most recent call last): File "C:\Users\monikat\AppData\Local\miniconda3\envs\earthaccess\lib\site-packages\earthaccess\store.py", line 607, in _download_file r.raise_for_status() File "C:\Users\monikat\AppData\Local\miniconda3\envs\earthaccess\lib\site-packages\requests\models.py", line 1021, in raise_for_status raise HTTPError(http_error_msg, response=self) requests.exceptions.HTTPError: 502 Server Error: Bad Gateway for url: https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ECO_L2T_LSTE.002/ECOv002_L2T_LSTE_27122_009_16TFM_20230418T182710_0710_01/ECOv002_L2T_LSTE_27122_009_16TFM_20230418T182710_0710_01_height.tif
PROCESSING TASKS | : 31%|████████████████████████████████████████████████████████████████████████▏ | 1120/3632 [07:28<30:44, 1.36it/s]Error while downloading the file ECOv002_L2T_LSTE_27122_009_16TFM_20230418T182710_0710_01_LST.tif Traceback (most recent call last): File "C:\Users\monikat\AppData\Local\miniconda3\envs\earthaccess\lib\site-packages\earthaccess\store.py", line 607, in _download_file r.raise_for_status() File "C:\Users\monikat\AppData\Local\miniconda3\envs\earthaccess\lib\site-packages\requests\models.py", line 1021, in raise_for_status raise HTTPError(http_error_msg, response=self) requests.exceptions.HTTPError: 502 Server Error: Bad Gateway for url: https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ECO_L2T_LSTE.002/ECOv002_L2T_LSTE_27122_009_16TFM_20230418T182710_0710_01/ECOv002_L2T_LSTE_27122_009_16TFM_20230418T182710_0710_01_LST.tif
PROCESSING TASKS | : 31%|████████████████████████████████████████████████████████████████████████▎ | 1122/3632 [07:28<20:18, 2.06it/s]Error while downloading the file ECOv002_L2T_LSTE_27137_010_16TFN_20230419T173850_0710_01_water.tif Traceback (most recent call last): File "C:\Users\monikat\AppData\Local\miniconda3\envs\earthaccess\lib\site-packages\earthaccess\store.py", line 607, in _download_file r.raise_for_status() File "C:\Users\monikat\AppData\Local\miniconda3\envs\earthaccess\lib\site-packages\requests\models.py", line 1021, in raise_for_status raise HTTPError(http_error_msg, response=self) requests.exceptions.HTTPError: 502 Server Error: Bad Gateway for url: https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ECO_L2T_LSTE.002/ECOv002_L2T_LSTE_27137_010_16TFN_20230419T173850_0710_01/ECOv002_L2T_LSTE_27137_010_16TFN_20230419T173850_0710_01_water.tif
PROCESSING TASKS | : 31%|████████████████████████████████████████████████████████████████████████▎ | 1123/3632 [07:29<18:13, 2.29it/s]Error while downloading the file ECOv002_L2T_LSTE_27137_010_16TFN_20230419T173850_0710_01_cloud.tif Traceback (most recent call last): File "C:\Users\monikat\AppData\Local\miniconda3\envs\earthaccess\lib\site-packages\earthaccess\store.py", line 607, in _download_file r.raise_for_status() File "C:\Users\monikat\AppData\Local\miniconda3\envs\earthaccess\lib\site-packages\requests\models.py", line 1021, in raise_for_status raise HTTPError(http_error_msg, response=self) requests.exceptions.HTTPError: 502 Server Error: Bad Gateway for url: https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ECO_L2T_LSTE.002/ECOv002_L2T_LSTE_27137_010_16TFN_20230419T173850_0710_01/ECOv002_L2T_LSTE_27137_010_16TFN_20230419T173850_0710_01_cloud.tif
And then, when the download finishes, the progress bars make it look as if all files were downloaded correctly: PROCESSING TASKS | : 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3632/3632 [25:04<00:00, 2.41it/s] COLLECTING RESULTS | : 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3632/3632 [00:00<00:00, 1208832.89it/s]
But the download folder contains only 3533 files.