ESPRI-Mod / synda

ESGF Downloader (this is a deprecated repository, the tool has now moved to https://github.com/ESGF/esgf-download)
https://espri-mod.github.io/synda/

Can synda download from CREATE-IP? #128

Closed hot007 closed 4 years ago

hot007 commented 4 years ago

Hi there,

I've noticed that the list produced by synda param project does not include CREATE-IP, yet synda search CREATE-IP returns the expected results. When I attempt to synda install one of these datasets, the files are identified and added to the queue, but all end in error status. Group membership does not appear to be required to access the files: I can download from the same URLs manually in the browser, but synda get and synda install fail, e.g.

> synda get https://esgf.nccs.nasa.gov/thredds/fileServer/CREATE-IP/reanalysis/NOAA-NCEP/CFSR/mon/atmos/tasmax/tasmax_Amon_reanalysis_CFSR_197901-201904.nc
Transfer failed with error 1 (did you subscribe to the required role/group ? (e.g. cmip5_research, cordex_research))
Download failed (https://esgf.nccs.nasa.gov/thredds/fileServer/CREATE-IP/reanalysis/NOAA-NCEP/CFSR/mon/atmos/tasmax/tasmax_Amon_reanalysis_CFSR_197901-201904.nc)

Is this because CREATE-IP is not a project known to Synda, or some other reason?

I notice on the CREATE-IP website, this: "Due to an ESGF credentials issue, regular ESGF users may have to either rename their .dodsrc file or comment out all the CURL lines in their ~/.dodsrc file to access this data via OPeNDAP.". Synda does not seem to contain a .dodsrc file but is this relevant?

hot007 commented 4 years ago

Note that wget of the same URL works fine, i.e. the issue certainly seems to be synda not being able to get the file(s).

AtefBN commented 4 years ago

Hi @hot007, I can't replicate this on my machine: the file link returns a 404 from both the browser and wget. Is this issue still relevant?

hot007 commented 4 years ago

Yeah, still relevant, I haven't managed to get synda to replicate any CREATE-IP data. The data node for this data seems to be a bit flakier than most, but right now I can search and download through the ESGF web interface and wget the URL, but I can't download with synda (search works fine, install and get fail).

 synda search -f CREATE-IP.reanalysis.NOAA-NCEP.CFSR.atmos.mon.v20200115
...
new  1.3 GB    CREATE-IP.reanalysis.NOAA-NCEP.CFSR.atmos.mon.v20200115.pr_Amon_reanalysis_CFSR_197901-201909.nc
...

synda install CREATE-IP.reanalysis.NOAA-NCEP.CFSR.atmos.mon.v20200115.pr_Amon_reanalysis_CFSR_197901-201909.nc
[fails]

 synda get https://esgf.nccs.nasa.gov/thredds/fileServer/CREATE-IP/reanalysis/NOAA-NCEP/CFSR/mon/atmos/pr/pr_Amon_reanalysis_CFSR_197901-201909.nc
Transfer failed with error 1 (did you subscribe to the required role/group ? (e.g. cmip5_research, cordex_research))
Download failed (https://esgf.nccs.nasa.gov/thredds/fileServer/CREATE-IP/reanalysis/NOAA-NCEP/CFSR/mon/atmos/pr/pr_Amon_reanalysis_CFSR_197901-201909.nc)

 wget https://esgf.nccs.nasa.gov/thredds/fileServer/CREATE-IP/reanalysis/NOAA-NCEP/CFSR/mon/atmos/pr/pr_Amon_reanalysis_CFSR_197901-201909.nc
--2020-02-14 01:39:12--  https://esgf.nccs.nasa.gov/thredds/fileServer/CREATE-IP/reanalysis/NOAA-NCEP/CFSR/mon/atmos/pr/pr_Amon_reanalysis_CFSR_197901-201909.nc
Resolving esgf.nccs.nasa.gov (esgf.nccs.nasa.gov)... 169.154.195.63, 2001:4d0:2418:2800::a99a:c33f
Connecting to esgf.nccs.nasa.gov (esgf.nccs.nasa.gov)|169.154.195.63|:443... connected.
HTTP request sent, awaiting response... 200 200
Length: 1297968320 (1.2G) [application/x-netcdf]
Saving to: ‘pr_Amon_reanalysis_CFSR_197901-201909.nc’

AtefBN commented 4 years ago

Ok, I spent the morning digging around, and I think this is another nail in the coffin of wget in synda. Can you please go to /home/---/.conda/envs/synda-env/lib/python2.7/site-packages/synda-3.10-py2.7.egg-info/scripts/sdget.sh and change line 311 to:

TLS_ONLY=" --secure-protocol=TLSv1_2 "

This should do the trick and your download should go through fine. The issue is a mismatch between the TLS protocol used by the server and the one used by synda's wget script. If this doesn't work, run wget -V and make sure the version of wget is at least 1.14. If it's older, please upgrade; otherwise it just won't work.
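The version check and the TLS option described above can be tried outside synda. The following is a minimal sketch, not part of synda: the helper names (version_ge, parse_wget_version, fetch_with_tls12) are hypothetical, and it assumes a GNU-style sort that supports -V. The wget options are the two quoted in this thread.

```shell
#!/usr/bin/env bash
# Sketch only: these helpers are hypothetical and not shipped with synda.

# Succeeds when version $1 is greater than or equal to version $2.
# Relies on version-aware sorting (sort -V), so 1.9 sorts before 1.14.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Reads `wget -V` output on stdin and prints the version number,
# e.g. "GNU Wget 1.14 built on linux-gnu." -> "1.14".
parse_wget_version() {
    head -n 1 | awk '{print $3}'
}

# Fetches one URL with the same TLS-related options sdget.sh sets
# (NO_CHECK_SERVER_CERTIFICATE and TLS_ONLY in this thread).
fetch_with_tls12() {
    wget --no-check-certificate --secure-protocol=TLSv1_2 "$1"
}
```

For instance, version_ge "$(wget -V | parse_wget_version)" 1.14 reports whether the local wget meets the minimum mentioned here, and fetch_with_tls12 followed by one of the failing URLs shows whether wget itself fails once TLS 1.2 is forced.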

I'm not a fan of how downloads are handled by an auxiliary bash script within synda; it makes errors very confusing, and this is proof. I have already implemented an alternative where all HTTP downloads are performed using the Python requests module. It should be released soon with synda-python 3.

Anyway let me know how it goes!

hot007 commented 4 years ago

Thanks Atef. Unfortunately I'm still seeing the same behaviour. I recall messing about with the TLS setting, thinking that was my problem, when I was trying to get the RPM install going (gave up, conda is a better option I think!), so I think you could be onto something with that. Anyway, after changing the TLS_ONLY variable I still find that synda returns the unhelpful transfer failed error with both synda install and synda get.

2020-02-17 23:46:04,382 INFO SDDMDEFA-102 Transfer failed (sdget_status=1,sdget_error_msg=Transfer failed with error 1 (did you subscribe to the required role/group ? (e.g. cmip5_research, cordex_research)),error_msg='Error occurs during download.',file_id=118,status=error,local_path=/home/---/.synda/data/CREATE-IP/reanalysis/NOAA-NCEP/CFSR/atmos/mon/v20200115/psl_Amon_reanalysis_CFSR_197901-201909.nc,url=https://esgf.nccs.nasa.gov/thredds/fileServer/CREATE-IP/reanalysis/NOAA-NCEP/CFSR/mon/atmos/psl/psl_Amon_reanalysis_CFSR_197901-201909.nc)
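Log entries like the one above carry the failing URL in their url= field, so the failed transfers can be listed and retried by hand with plain wget. This is a rough sketch that assumes the entry format shown here (tag SDDMDEFA-102, url= as the last field before the closing parenthesis). The function name failed_urls is hypothetical, and the location of the log file is not given in this thread, so it is read from stdin.

```shell
#!/usr/bin/env bash
# Sketch only: failed_urls is a hypothetical helper, not part of synda.

# Reads synda log lines on stdin and prints the URL of every failed
# transfer, i.e. every line tagged SDDMDEFA-102 that ends in ",url=...)".
failed_urls() {
    sed -n '/SDDMDEFA-102/s/.*,url=\(.*\))[[:space:]]*$/\1/p'
}
```

Feeding the log file into failed_urls and passing each printed URL to wget reproduces the manual comparison made in this thread: if wget succeeds where synda fails, the problem lies in how synda invokes the download rather than in the data node.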

> synda get CREATE-IP.reanalysis.NOAA-NCEP.CFSR.atmos.mon.v20200115.psl_Amon_reanalysis_CFSR_197901-201909.nc
1 file(s) will be downloaded for a total size of 508.4 MB.
Transfer failed with error 1 (did you subscribe to the required role/group ? (e.g. cmip5_research, cordex_research))
Download failed (https://esgf.nccs.nasa.gov/thredds/fileServer/CREATE-IP/reanalysis/NOAA-NCEP/CFSR/mon/atmos/psl/psl_Amon_reanalysis_CFSR_197901-201909.nc)

And again wget of the same URL works fine.

Out of interest, do the synda commands work for you? e.g. synda install CREATE-IP.reanalysis.NOAA-NCEP.CFSR.atmos.mon.v20200115.psl_Amon_reanalysis_CFSR_197901-201909.nc

AtefBN commented 4 years ago

(synda-from-scratch) [root@synda-dev ~]# synda install CREATE-IP.reanalysis.NOAA-NCEP.CFSR.atmos.mon.v20200115.psl_Amon_reanalysis_CFSR_197901-201909.nc

Search completed.                                                                                   

1 file(s) will be added to the download queue.
Once downloaded, 508.4 MB of additional disk space will be used.

Yes it goes through fine. It is indeed curious the TLS hack hasn't solved the issue, I could download that file on my machine just fine. Wonder what I've missed.

hot007 commented 4 years ago

Hmm. Yes I get that much, but then it goes straight to error state once it tries to actually download the file.

(synda-env) ct5255@slwce-1> synda install CREATE-IP.reanalysis.NOAA-NCEP.CFSR.atmos.mon.v20200115.psl_Amon_reanalysis_CFSR_197901-201909.nc
1 file(s) will be added to the download queue.
Once downloaded, 508.4 MB of additional disk space will be used.
Do you want to continue? [Y/n] y
1 file(s) enqueued
You can follow the download using 'synda watch' and 'synda queue' commands

(synda-env) ct5255@slwce-1> synda queue

status      count  size
done           73  3.1 GB
running         1  508.4 MB
(synda-env) ct5255@slwce-1> synda watch
No current download
(synda-env) ct5255@slwce-1> synda queue

status      count  size
done           73  3.1 GB
error           1  508.4 MB

And just to confirm,

(synda-env) ct5255@slwce-1> cat .conda/envs/synda-env/lib/python2.7/site-packages/synda-3.10-py2.7.egg-info/scripts/sdget.sh | head -313 | tail -6
# Don't check the server certificate against the available certificate authorities.  Also don't require the URL host name to match the common name presented by the certificate.
NO_CHECK_SERVER_CERTIFICATE=" --no-check-certificate "
#NO_CHECK_SERVER_CERTIFICATE=" "
TLS_ONLY=" --secure-protocol=TLSv1_2 "
#TLS_ONLY=" "
g__lifetime=168

AtefBN commented 4 years ago

Thanks for the details @hot007. I did indeed go through with the download using the daemon after the install, and it fell into the same old error on my machine as well. Can you check your wget version for me with wget -V? If it returns a value below 1.14, that should be the issue here, as versions prior to 1.14 do not support TLS version 1.2. In that case, the fix would be to remove --secure-protocol from the wget command (or update wget to 1.14 or later). Make sure you update the sdget.sh file that resides under $ST_HOME/bin/sdget.sh, as that is the script run by the daemon. Let me know how this goes.
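Since two copies of sdget.sh are mentioned in this thread (the one under the conda environment and the one under $ST_HOME/bin that the daemon runs), it can help to confirm that both carry the same TLS_ONLY setting. A minimal sketch, assuming only that the active setting is an uncommented line starting with TLS_ONLY=, as in the excerpt shown earlier; the function name show_tls_only is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: show_tls_only is a hypothetical helper, not part of synda.

# Prints the active (uncommented) TLS_ONLY assignment of every sdget.sh
# copy passed as an argument, prefixed with the file name.
show_tls_only() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            grep -H '^TLS_ONLY=' "$f" || echo "$f: no active TLS_ONLY line"
        else
            echo "$f: not readable" >&2
        fi
    done
}
```

Calling show_tls_only "$ST_HOME/bin/sdget.sh" together with the path of the conda copy quoted above prints one line per file, which makes a stale copy easy to spot.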

hot007 commented 4 years ago

I'm running wget 1.14, so that ought to be okay.

ct5255@slwce-1> wget -V
GNU Wget 1.14 built on linux-gnu.
...

hot007 commented 4 years ago

Hi Atef,

Following a tip from a colleague, I tried starting from scratch with an Ubuntu VM, and synda install from CREATE-IP works fine! I had tried upgrading wget on my CentOS 7 machine to v1.19, but it didn't help. The Ubuntu VM runs wget 1.19, synda 3.10, and otherwise effectively the same configuration as before, but installation of CREATE-IP datasets works there. There must be some incompatibility in the OS, though I really have no idea what!!

So anyway, my problem is resolved, I guess, though it is still the case that as far as I can tell synda just doesn't work quite right on CentOS.

cheers Claire

AtefBN commented 4 years ago

That is great news, I was running out of ideas as I couldn't replicate the behavior on my machines anymore. Glad it works fine now. Atef.