appelmar / gdalcubes

Creating and analyzing Earth observation data cubes in R
https://gdalcubes.github.io

Error when trying to access NASA HLS dataset requiring netrc authentication #85

Open grueck opened 1 year ago

grueck commented 1 year ago

I ran into problems when trying to access NASA's Harmonized Landsat Sentinel (HLS) product through gdalcubes. The download URLs are secured via netrc (see https://lpdaac.usgs.gov/products/hlss30v002 ). For the terra library, the configuration works as described here: https://github.com/rspatial/terra/issues/608 . I tried to pass these same options, plus the netrc options listed at https://gdal.org/user/configoptions.html , through gdalcubes_set_gdal_config.

For rstac, options are passed through the post request, which works. Here is my reproducible example:

library(rstac)
library(sf)
library(gdalcubes)
library(terra)
library(stars)
library(httr)  # provides config() and set_cookies() used in post_request() below

gdalcubes_set_gdal_config("GDAL_HTTP_NETRC", "YES")
gdalcubes_set_gdal_config("GDAL_HTTP_NETRC_FILE", "/home/ubuntu/.netrc") #replace with your path
gdalcubes_set_gdal_config("GDAL_HTTP_UNSAFESSL", "YES")
gdalcubes_set_gdal_config("GDAL_HTTP_COOKIEFILE", ".rcookies")
gdalcubes_set_gdal_config("GDAL_HTTP_COOKIEJAR", ".rcookies")
gdalcubes_set_gdal_config("GDAL_DISABLE_READDIR_ON_OPEN", "EMPTY_DIR")
gdalcubes_set_gdal_config("CPL_VSIL_CURL_ALLOWED_EXTENSIONS", "TIF")

setGDALconfig(c("GDAL_HTTP_UNSAFESSL=YES", 
                "GDAL_HTTP_COOKIEFILE=.rcookies",
                "GDAL_HTTP_COOKIEJAR=.rcookies", 
                "GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR",
                "CPL_VSIL_CURL_ALLOWED_EXTENSIONS=TIF"))

s = stac("https://cmr.earthdata.nasa.gov/stac/LPCLOUD")
bbox <- c(27.36, -5.153, 30.469, 2.112433)
start <- "2018-08-01T00:00:00Z"
end <- "2018-08-31T00:00:00Z"
netrc <- "/home/ubuntu/.netrc"  #a sample - replace with your path
#test STAC
items <- s %>%
  stac_search(collections = "HLSS30",
              bbox = bbox,
              datetime = paste(start,end, sep = "/")) %>%
  post_request(config(netrc = TRUE, netrc_file = netrc), set_cookies("LC" = "cookies"))

assets <- c("B02", "B03", "B04")
col <- stac_image_collection(items$features, asset_names = assets, 
                             property_filter = function(x) {x[["eo:cloud_cover"]] < 30})

al <- assets_url(items, asset_names = assets)
#test terra
cog <- rast(paste0("/vsicurl/", al[1]))
mycrs <- crs(cog, describe=T)
epsg <- mycrs$code

# lower-left and upper-right corners of bbox, reprojected to the CRS of the COG
ll <- st_sfc(st_point(bbox[1:2]), crs = 4326) %>% st_transform(st_crs(cog))
ur <- st_sfc(st_point(bbox[3:4]), crs = 4326) %>% st_transform(st_crs(cog))

v = cube_view(srs = epsg,  extent = list(t0 = as.character(start), t1 = as.character(end),
                                         left = ll[[1]][1], right = ur[[1]][1],  top = ur[[1]][2], bottom = ll[[1]][2]),
              dx = 30, dy = 30, dt = "P5D", aggregation = "median", resampling = "bilinear")

gdalcubes_options(parallel = 4) 
bands <- c("B04","B03", "B02")
medianRGB <- function(col, v){
  raster_cube(col, v) %>%
    select_bands(bands) %>%
    reduce_time(c("median(B04)", "median(B03)", "median(B02)"))
}
#test gdalcubes
medianRGB(col, v) %>%
              st_as_stars() -> RGBout
write_stars(RGBout, "out.tif", type = "Int16") 

The error occurs at st_as_stars() -> RGBout, when gdalcubes tries to fetch the data.

Error in gdalcubes:::gc_exec_worker(j$cube, j$worker_id, j$worker_count,  : 
  c++ exception (unknown reason)
Execution halted
Error in gdalcubes:::gc_exec_worker(j$cube, j$worker_id, j$worker_count,  : 
  c++ exception (unknown reason)
Execution halted
Error in gdalcubes:::gc_exec_worker(j$cube, j$worker_id, j$worker_count,  : 
  c++ exception (unknown reason)
Execution halted
Error in gdalcubes:::gc_exec_worker(j$cube, j$worker_id, j$worker_count,  : 
  c++ exception (unknown reason)
Execution halted
Error in gc_eval_cube(x, fname, .pkgenv$compression_level, with_VRT, .pkgenv$ncdf_write_bounds,  : 
  one or more worker processes failed to compute data cube chunks

When I try a second time, R hangs and the session needs to be aborted. The code works perfectly with S2 data on AWS, so I am afraid it has to do with how the netrc authentication is handled in gdalcubes.
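Before suspecting gdalcubes itself, it may help to rule out the netrc file. The following is a minimal sketch in base R, assuming the file uses the usual whitespace-separated "machine ... login ... password ..." layout and that the Earthdata Login host is urs.earthdata.nasa.gov. The helper name netrc_has_host is hypothetical and not part of any package mentioned in this thread.

```r
# Hypothetical helper: TRUE if the netrc file contains a "machine <host>" entry.
netrc_has_host <- function(path, host = "urs.earthdata.nasa.gov") {
  if (!file.exists(path)) return(FALSE)
  # Split the file into whitespace-separated tokens; quote = "" keeps
  # passwords containing quote characters from confusing the parser.
  tokens <- scan(path, what = character(), quote = "", quiet = TRUE)
  idx <- which(tokens == "machine")
  # The token following each "machine" keyword is the host name.
  any(tokens[idx + 1] == host, na.rm = TRUE)
}

# Example call; replace the path with your own netrc file.
netrc_has_host(path.expand("~/.netrc"))
```

This only confirms that an entry for the host exists; it does not verify the credentials or show whether the gdalcubes worker processes actually read the file.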

grueck commented 1 year ago

I found an error in my code: apparently, the srs string changed in a recent update and must now be prefixed by "EPSG" when setting srs for cube_view, and this error is not caught. The problem now is that the download seems to work but produces an empty image filled with 0's.
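The prefix fix described above can be sketched as a small base R helper. The name as_epsg_srs is hypothetical, and the code 32635 is only an example value; the "EPSG:<code>" form is assumed to be what cube_view expects for srs.

```r
# Hypothetical helper: turn a bare EPSG code such as "32635" into "EPSG:32635".
# Strings that already carry the prefix are returned unchanged.
as_epsg_srs <- function(code) {
  code <- as.character(code)
  ifelse(grepl("^EPSG:", code, ignore.case = TRUE), code, paste0("EPSG:", code))
}

as_epsg_srs("32635")
```

In the example from the first comment, this would mean passing srs = as_epsg_srs(epsg) to cube_view instead of srs = epsg.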

cboettig commented 1 year ago

I think you get empty images when authentication fails?

I have found that authentication using bearer tokens with EarthData login is easier to handle than getting that darn netrc mechanism to work. (e.g. https://gist.github.com/cboettig/5401bd149a2a27bde2042aa4f7cde25b)
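A minimal sketch of the bearer-token idea, not necessarily what the linked gist does: write the token into a header file and point GDAL's GDAL_HTTP_HEADER_FILE option at it. The environment variable name EARTHDATA_TOKEN and the helper write_bearer_header are assumptions made for illustration. Whether gdalcubes worker processes pick up the option is not confirmed in this thread.

```r
# Hypothetical helper: write an "Authorization: Bearer <token>" header file
# that GDAL can attach to its HTTP requests, and return the file path.
write_bearer_header <- function(token, path = tempfile(fileext = ".txt")) {
  stopifnot(is.character(token), length(token) == 1, nzchar(token))
  writeLines(paste("Authorization: Bearer", token), path)
  path
}

# EARTHDATA_TOKEN is an assumed variable name; store your own token there.
token <- Sys.getenv("EARTHDATA_TOKEN")
if (nzchar(token) && requireNamespace("gdalcubes", quietly = TRUE)) {
  header_file <- write_bearer_header(token)
  gdalcubes::gdalcubes_set_gdal_config("GDAL_HTTP_HEADER_FILE", header_file)
}
```

Keeping the token in an environment variable avoids hard-coding it in a script. The header file itself still contains the secret, so it should live somewhere only you can read.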