hypertidy / ncapi

Earthdata credentialed access #4

Open btupper opened 2 years ago

btupper commented 2 years ago

I'm not sure that this is the right venue, but I have been admiring and appreciating all of your effort. So, I thought, "hey, let's see if Mikey likes it!" (sorry, that television advert might date me.)

PO.DAAC is moving its resources to the cloud and moving its cataloging to a web-based Common Metadata Repository, aka "cmr". I think this means no more THREDDS, but I am fuzzy on that. I have started a package to navigate cmr - finding resources seems to work well (and quickly!). But once one finds the URL for the resource, the cloud-based data appears to require credentials (via a temporary token obtained with username:password). None of the trusty opendapy R packages based upon the NetCDF C API seem to be able to access these resources because, well, the API's nc_open expects a local filename or a URL, neither of which should include the token. Booo.

There are some tutorials available, but I can't seem to translate them into working R code. It feels like I need to create a credentialed session-context, and then open the resource. But deep down what gets passed to nc_open(x) still has to be just the path or URL without credentials attached. At this point I am so muddled that up is down and ice cream tastes like broccoli.

Anyway, I am wondering how you see this playing out for ncapi and its opendap friends?

mdsumner commented 2 years ago

oh thanks, definitely interested - I'll have a look

mdsumner commented 2 years ago

fwiw Ben Raymond has this in our downloader kit which might help with established examples just for trying


I wondered yesterday if RNetCDF was being too aggressive wanting a file to exist ...while GDAL was able to open a link I had

btupper commented 2 years ago

My inner sloth wants to continue to leverage opendap ala RNetCDF/ncdf4 rather than the wget/curl request-then-download route. But I see that bowerbird takes care of all of the details, and that appeals to my inner sloth, too.

mdsumner commented 2 years ago

bowerbird is dope for automating downloads, we couldn't live without it now

possibly you can put the token on the end of the .nc url with question mark and &?

mdsumner commented 2 years ago

did you see this?


btupper commented 2 years ago

I vaguely recall trying variants of url.nc?token=foobar but recall no joy. I should circle back on that.

This is interesting - I'll give it a shake.

btupper commented 2 years ago

I'm getting no joy. I can get/delete tokens but after that nada. From what I can see the C API nc_open wants an unadorned path.

It looks like a two-step approach, (a) download the subset to a file and (b) then open it with a reader, is the path forward. I suppose that is a good model, but my inner laziness really appreciates the simplicity of an unfettered OPeNDAP session without needing an intermediary file.
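
The two-step route described in this comment might look something like the sketch below. It assumes curl is available, that Earthdata Login credentials live in a .netrc file, and that a cookie file carries the session across redirects. The function name `download_granule`, the `CURL` override, and the default file locations are hypothetical and do not come from this thread.

```shell
# Sketch only: fetch a protected granule to a local file, then open that
# file with whatever reader you like (RNetCDF, ncdf4, terra, ...).
# Assumed, not from the thread: the file locations and the function name.
NETRC_FILE="${NETRC_FILE:-$HOME/.netrc}"
COOKIE_FILE="${COOKIE_FILE:-$HOME/.urs_cookies}"

download_granule() {
  url="$1"    # remote granule URL
  dest="$2"   # local file to write
  # CURL can be overridden (e.g. CURL=echo) to preview the command
  # instead of running it.
  "${CURL:-curl}" --fail --location \
    --netrc-file "$NETRC_FILE" \
    --cookie "$COOKIE_FILE" --cookie-jar "$COOKIE_FILE" \
    --output "$dest" "$url"
}

# Usage (not run here):
#   download_granule "https://.../some_granule.nc" /tmp/some_granule.nc
```

The local copy can then be handed to nc_open as a plain path, which sidesteps the question of where credentials go at the cost of an intermediary file.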

mdsumner commented 2 years ago

oh absolutely, it's got to be possible - a bit outside my skill though, probably can see how gdal does it. I'll try to remember to look tomorrow 🙏

mdsumner commented 2 years ago

so, you need at least .netrc and a cookiefile according to this GDAL stuff


the .netrc part doesn't seem to like special chars in my password ... so I'll keep trying a few things
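
For anyone following along, a minimal sketch of creating those two files is below. It assumes the Earthdata Login host is urs.earthdata.nasa.gov and uses placeholder credentials. The variable names and the cookie file name `.urs_cookies` are illustrative choices. Everything is written to a scratch directory unless `TARGET_DIR` is set, so nothing in a real home directory is touched by accident.

```shell
# Sketch only: create a .netrc and an empty cookie file.
# Set TARGET_DIR="$HOME" to write the real files; by default a scratch
# directory is used. User name and password are placeholders.
TARGET_DIR="${TARGET_DIR:-$(mktemp -d)}"
EARTHDATA_USER="${EARTHDATA_USER:-your_username}"
EARTHDATA_PASS="${EARTHDATA_PASS:-your_password}"

NETRC_FILE="$TARGET_DIR/.netrc"
COOKIE_FILE="$TARGET_DIR/.urs_cookies"

# One entry for the login host (assumed: urs.earthdata.nasa.gov).
# No quoting or escaping of the password is attempted here.
printf 'machine urs.earthdata.nasa.gov login %s password %s\n' \
  "$EARTHDATA_USER" "$EARTHDATA_PASS" > "$NETRC_FILE"

# Credential files should be readable by the owner only.
chmod 600 "$NETRC_FILE"

# An empty cookie file is enough to start; the HTTP client fills it in.
: > "$COOKIE_FILE"
chmod 600 "$COOKIE_FILE"

echo "wrote $NETRC_FILE and $COOKIE_FILE"
```

Since the password is written verbatim, this sketch does nothing to address the special-character problem mentioned above.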

mdsumner commented 2 years ago

here Even provides some details, I don't know what the cookiefile should consist of


btupper commented 2 years ago

I have one (as well as .netrc and .dodsrc). The ~/.cookies file has this...

btupper@ecocast ~ $ more .cookies
# Netscape HTTP Cookie File
# https://curl.haxx.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.

#HttpOnly_opendap.earthdata.nasa.gov    FALSE   /   FALSE   0   JSESSIONID  DF24...F0801

mdsumner commented 2 years ago

can't believe it! I got something

gdalinfo https://archive.podaac.earthdata.nasa.gov/podaac-ops-cumulus-protected/VIIRS_N20-STAR-L3U-v2.80/20220509201000-STAR-L3U_GHRSST-SSTsubskin-VIIRS_N20-ACSPO_V2.80-v02.0-fv01.0.nc --config GDAL_HTTP_COOKIEFILE /tmp/cookies.txt --config GDAL_HTTP_COOKIEJAR /tmp/cookies.txt 

this is going to be good, thanks! I'll see if we can do this via netcdf, but gdal has gotten good enough for most of my uses

mdsumner commented 2 years ago

it doesn't seem to need the .netrc either

mdsumner commented 2 years ago

oh, yes it does - it must have been cached, or something - still this is very promising

mdsumner commented 2 years ago

ah whoops, make sure you use /vsicurl/ for GDAL


gives subdatasets



class       : SpatRaster 
dimensions  : 9000, 18000, 1  (nrow, ncol, nlyr)
resolution  : 0.02, 0.02  (x, y)
extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 
source      : 20220509201000-STAR-L3U_GHRSST-SSTsubskin-VIIRS_N20-ACSPO_V2.80-v02.0-fv01.0.nc:sst_front_position 
varname     : sst_front_position (Binary SST front position indicator) 
name        : sst_front_position 
time        : 2022-05-09 20:10:01 
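
As a rough illustration of the /vsicurl/ form, the sketch below builds the prefixed path for the same granule and prints two gdalinfo commands: one for the whole file and one for a single variable using the netCDF driver's `NETCDF:"path":variable` subdataset syntax. The cookie options are the ones used earlier in the thread. The shell variable names are illustrative, and the sketch assumes a working .netrc is also in place, as noted above. The commands are printed, not executed.

```shell
# Sketch only: build /vsicurl/ paths and print the gdalinfo commands.
URL="https://archive.podaac.earthdata.nasa.gov/podaac-ops-cumulus-protected/VIIRS_N20-STAR-L3U-v2.80/20220509201000-STAR-L3U_GHRSST-SSTsubskin-VIIRS_N20-ACSPO_V2.80-v02.0-fv01.0.nc"
COOKIES="${COOKIES:-/tmp/cookies.txt}"

# /vsicurl/ asks GDAL to read the remote file over HTTP(S) rather than
# treating the string as a local path.
VSI_PATH="/vsicurl/$URL"

# One variable, addressed with the netCDF subdataset syntax.
SUBDATASET="NETCDF:\"$VSI_PATH\":sst_front_position"

# Print copy-and-paste commands; nothing is fetched here.
for target in "$VSI_PATH" "$SUBDATASET"; do
  printf '%s\n' "gdalinfo '$target' --config GDAL_HTTP_COOKIEFILE $COOKIES --config GDAL_HTTP_COOKIEJAR $COOKIES"
done
```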

I tried setting the cookies thing as env vars and attempted to rely on the .netrc for NetCDF itself, but no joy

mjwoods commented 8 months ago

Hi @btupper and @mdsumner , sorry for my late entry into this discussion. I came across some information that may be helpful for you at https://opendap.github.io/documentation/tutorials/ClientAuthentication.html#dodsrc .
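
A minimal sketch of the kind of .dodsrc that page describes is below. It assumes the `HTTP.NETRC` and `HTTP.COOKIEJAR` keys as documented for netCDF-C based clients, and it assumes the .netrc and cookie file sit next to it. The paths and variable names are placeholders, and nobody in this thread has confirmed that it works against PO.DAAC.

```shell
# Sketch only: write a .dodsrc that points the netCDF-C DAP client at a
# .netrc and a cookie jar. Set TARGET_DIR="$HOME" to write the real file;
# by default a scratch directory is used.
TARGET_DIR="${TARGET_DIR:-$(mktemp -d)}"
DODSRC_FILE="$TARGET_DIR/.dodsrc"

# Unquoted EOF so that $TARGET_DIR is expanded into absolute paths.
cat > "$DODSRC_FILE" <<EOF
HTTP.NETRC=$TARGET_DIR/.netrc
HTTP.COOKIEJAR=$TARGET_DIR/.urs_cookies
EOF

cat "$DODSRC_FILE"
```

If this works as documented, nc_open would still receive an unadorned URL, with the credentials supplied through the rc file.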