mjwoods / RNetCDF

Read and write netcdf format in R

OpenDAP access for password protected files #20

Open ashiklom opened 6 years ago

ashiklom commented 6 years ago

OpenDAP is a great way to remotely access large NetCDF datasets.

The current version of RNetCDF works with OpenDAP links for unauthenticated URLs. For example, for GHCN CAMS data from the IRI Data Library, this just works:

library(RNetCDF)
# Example URL to IRI Data Library GHCN_CAMS data 
url <- "http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCEP/.CPC/.GHCN_CAMS/.gridded/.deg0p5/.temp/dods"
nc <- open.nc(url)
print.nc(nc)
close.nc(nc)

However, some URLs, including those of the UCAR Research Data Archive (RDA) and the NASA GES DISC, are password protected. Currently, my approach for accessing these services is to manually build the subset URL, then download the resulting NetCDF file and read it from disk. For example, for some MERRA outputs:

# Base URL for MERRA at this date
url_base <- "https://goldsmr4.gesdisc.eosdis.nasa.gov:443/opendap/MERRA2/M2I1NXASM.5.12.4/1983/08/MERRA2_100.inst1_2d_asm_Nx.19830801.nc4"
# Add NCDF4 output tail (not a typo -- the ending is `.nc4.nc4`)
url_nc4 <- paste0(url_base, ".nc4")
# Add the subset query for wind U component (variable U2M)
url <- paste0(url_nc4, "?U2M[0:1:23][0:1:1][0:1:1]")

tmp <- tempfile()

library(RNetCDF)
library(curl)
library(magrittr)

h <- new_handle() %>%
  handle_setopt(
    followlocation = TRUE,
    username = my_user,
    password = my_pass
  )

curl_download(url, tmp, handle = h)

nc <- open.nc(tmp)
print.nc(nc)
close.nc(nc)

...or, using the crul package...

library(crul)
http <- HttpClient$new(
  url = url,
  auth = auth(user = my_user, pwd = my_pass)
)
result <- http$get()
tmp2 <- tempfile()
writeBin(result$content, tmp2)
nc <- RNetCDF::open.nc(tmp2)
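
The subset query in the MERRA example above follows the OpenDAP constraint pattern `VAR[start:stride:stop]`, with one bracketed range per dimension. As a minimal sketch, that string can be built with a small helper instead of by hand. The function name `dap_subset_url` and the `example.org` URL are hypothetical and are not part of RNetCDF or any server API.

```r
# Hypothetical helper: append an OpenDAP hyperslab constraint to a URL.
# `ranges` is a list of c(start, stride, stop) triples, one per dimension,
# using the same zero-based indices as the MERRA example.
dap_subset_url <- function(base_url, variable, ranges) {
  slabs <- vapply(
    ranges,
    function(r) {
      sprintf("[%d:%d:%d]", as.integer(r[1]), as.integer(r[2]), as.integer(r[3]))
    },
    character(1)
  )
  paste0(base_url, "?", variable, paste(slabs, collapse = ""))
}

# Placeholder URL (an assumption for illustration only)
example_url <- dap_subset_url(
  "https://example.org/opendap/file.nc4.nc4",
  "U2M",
  list(c(0, 1, 23), c(0, 1, 1), c(0, 1, 1))
)
print(example_url)
```

The result can be passed to `curl_download()` or `HttpClient$new()` exactly as in the two approaches above.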

However, it would be great if there were a way to access the NetCDF file more directly and subset it from R, as is already possible with unauthenticated services like the one in the first example.

The underlying issue here is that RNetCDF::open.nc only supports simple URLs as connections. However, if it could be modified to work with full HTTP requests generated by curl or crul, I think everything else should work the same.

mjwoods commented 6 years ago

That's an interesting use case. I haven't tried password protected OpenDAP sites myself.

Behind the scenes, RNetCDF simply passes your URL to the NetCDF C library, which creates the connection to OpenDAP. If your data source works with ncdump, it should work with RNetCDF.

I found some documentation for setting up passwords for ncdump at http://docs.opendap.org/index.php/DAP_Clients_-_Authentication#ncdump. Would you mind trying these instructions to see if they work for you with RNetCDF?
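
For anyone trying this from R, here is a rough sketch of the kind of setup those instructions describe: a `.netrc` file holding the credentials and a `.dodsrc` file pointing the NetCDF C library at it. Several things here are assumptions rather than tested facts: the helper name `write_dap_credentials` is hypothetical, the key names `HTTP.COOKIEJAR` and `HTTP.NETRC` are as I understand them from the NetCDF C library's authorization documentation, and the default host `urs.earthdata.nasa.gov` is my assumption for NASA Earthdata Login. I have not verified this against RDA or GES DISC.

```r
# Hypothetical helper: write credential files for OpenDAP access.
# WARNING: overwrites any existing .netrc and .dodsrc in `dir`.
write_dap_credentials <- function(user, pass, dir,
                                  auth_host = "urs.earthdata.nasa.gov") {
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  netrc   <- file.path(dir, ".netrc")
  cookies <- file.path(dir, ".dap_cookies")
  dodsrc  <- file.path(dir, ".dodsrc")

  # One line per authentication host (host name is an assumption)
  writeLines(
    sprintf("machine %s login %s password %s", auth_host, user, pass),
    netrc
  )
  Sys.chmod(netrc, mode = "0600")  # restrict access where the OS supports it

  # Key names assumed from the NetCDF C library authorization docs
  writeLines(
    c(paste0("HTTP.COOKIEJAR=", cookies),
      paste0("HTTP.NETRC=", netrc)),
    dodsrc
  )
  invisible(c(netrc = netrc, dodsrc = dodsrc))
}

# Demonstration in a scratch directory; real use would presumably target
# the home directory, since that is where the library looks for these files.
demo_dir <- file.path(tempdir(), "dap_credentials_demo")
paths <- write_dap_credentials("my_user", "my_pass", dir = demo_dir)
print(readLines(paths[["dodsrc"]]))
```

If the files are picked up, `open.nc(url)` on the protected URL should then behave like the unauthenticated example, with no changes needed in RNetCDF itself.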