DHI-GRAS / zamwis-fd

ZAMWIS Floods and Droughts workflow

TRMM download fails #8

Closed j08lue closed 6 years ago

j08lue commented 6 years ago

(fd) C:\zamwis_fd>call flooddrought download_trmm --extent "18.375,36.5,-20.25,-8.875" C:\zamwis_fd\\data\TRMM\TRMM.nc
flooddrought.ingestion.fdchain - Initializing FD input file
flooddrought.ingestion.fdchain_earthdata - Downloading data for 2018-01-22 ...

Downloading:   0%|          | 0/1 [00:00<?, ?item/s]
Downloading: 100%|##########| 1/1 [00:01<00:00,  1.81s/item]
flooddrought - OSError: [Errno -51] NetCDF: Unknown file format: b'C:\\zamwis_fd\\data\\TRMM\\downloads\\3B42RT_Daily.20180122.7.nc4'

j08lue commented 6 years ago

We need to skip these broken files on the server so that a single bad file does not abort the whole download. Perhaps with a flag, so the user can choose whether they can live with an incomplete time series.
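A minimal sketch of what such a flag could look like. This is not the actual flooddrought code: `download_many`, `download_one` and `skip_errors` are hypothetical names, and it assumes a broken file surfaces as an `OSError`, as in the traceback above.

```python
from typing import Callable, Iterable, List, Tuple


def download_many(
    dates: Iterable[str],
    download_one: Callable[[str], str],
    skip_errors: bool = False,
) -> Tuple[List[str], List[str]]:
    """Download one file per date and return (paths, skipped_dates).

    `download_one` is a hypothetical callable that fetches and validates
    the file for a single date and raises OSError if it is unreadable.
    """
    paths: List[str] = []
    skipped: List[str] = []
    for date in dates:
        try:
            paths.append(download_one(date))
        except OSError:
            if not skip_errors:
                # Default behaviour: one bad file aborts the whole run.
                raise
            # Opt-in behaviour: remember the gap and carry on.
            skipped.append(date)
    return paths, skipped
```

Returning the skipped dates alongside the paths lets the caller report exactly where the time series is incomplete.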

j08lue commented 6 years ago

It turns out these are not broken files. The server just responds with this HTML page:

<!DOCTYPE html><html ng-app="uuiApp" ng-controller="metaCtrl" class="bg-deep-blue"><head><base href="/"><link rel="stylesheet" href="css/vendor.css"><link rel="stylesheet" href="css/main.css"><title ng-if="datasetProvider.searchConstraints.keywords.length == null">GES DISC</title><span ng-if="datasetProvider.searchConstraints.keywords.length != null"><title>{{datasetProvider.searchConstraints.keywords | uppercase}} Data holdings at the GES DISC</title><meta name="description" content="{{datasetProvider.searchConstraints.keywords.join(' ') | uppercase}}, Unified User Interface at NASA, UUI, data, GES DISC, datasets"></span><script src="https://cdn.earthdata.nasa.gov/tophat2/tophat2.js" id="earthdata-tophat-script" data-show-status="false" data-status-api-url="https://status.earthdata.nasa.gov/api/v1/notifications" async="async" defer="defer"></script><meta name="fragment" content="!"><script> window.prerenderReady = false; </script></head><body ng-model-options="{ timezone: 'UTC' }" style="overflow-x:hidden;"><div ui-view=""></div><spinner></spinner><status-list></status-list><back-to-top show-offset="300" fade-offset="1200"></back-to-top><script src="lib/MathJax/MathJax.js?config=TeX-MML-AM_CHTML"></script><script src="lib/ckeditor/ckeditor.js"></script><script src="scripts/vendor.js"></script><script src="scripts/app.js"></script><script language="javascript" id="_fed_an_ua_tag" src="https://dap.digitalgov.gov/Universal-Federated-Analytics-Min.js?agency=NASA&sub-agency=GSFC&sp=data,find,giovanni,visualize,GESDISC&dclink=true" defer="">
  </script></body></html>

This HTML then gets written to disk as the .nc4 file.

It works with my personal user credentials, so it must be related to the zamwisfd user.

j08lue commented 6 years ago

I wonder why our raise_for_status does not catch that:

https://github.com/DHI-GRAS/earthdata_download/blob/72cb5ba4ff6315d8f5387feddf10435f8a37cb86/earthdata_download/download.py#L76

-- why is the status OK, NASA?
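For context: assuming the downloader uses `requests`, `raise_for_status` only raises on 4xx/5xx status codes, so an HTML page served with a 200 status passes straight through. One way to catch this case is to check the first bytes of the payload before writing the file. The sketch below is not the earthdata_download implementation. `looks_like_netcdf` and `check_payload` are hypothetical helpers, and it assumes the file signature sits at offset 0.

```python
# File signatures: netCDF classic / 64-bit offset / CDF-5, and HDF5 (netCDF-4).
NETCDF_MAGIC = (b"CDF\x01", b"CDF\x02", b"CDF\x05", b"\x89HDF\r\n\x1a\n")


def looks_like_netcdf(head: bytes) -> bool:
    """Return True if the leading bytes match a known netCDF/HDF5 signature."""
    return head.startswith(NETCDF_MAGIC)


def check_payload(head: bytes, content_type: str = "") -> None:
    """Raise ValueError if the response is clearly not a netCDF file.

    `head` is the first chunk of the response body; `content_type` is the
    value of the Content-Type response header, if any.
    """
    if "text/html" in content_type.lower() or not looks_like_netcdf(head):
        raise ValueError(
            "Response does not look like netCDF "
            f"(Content-Type={content_type!r}, first bytes={head[:16]!r})"
        )
```

Called on the first chunk of a streamed response, a check like this would have rejected the `<!DOCTYPE html>` page above before it reached the .nc4 file.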

j08lue commented 6 years ago

The fix was to authorize the NASA GESDISC DATA ARCHIVE (nasa_gesdisc_data_archive) application on https://urs.earthdata.nasa.gov/users/zamwisfd/authorized_apps . I had authorized GES DISC, but that was apparently not the right one.

j08lue commented 6 years ago

This is fixed for the scope of zamwis_fd.

The issue that we did not catch the failing download is raised here: https://github.com/DHI-GRAS/earthdata_download/issues/15