appelmar / gdalcubes

Creating and analyzing Earth observation data cubes in R
https://gdalcubes.github.io
Other
120 stars 28 forks source link

unexpected segfaults from raster_cube()? #92

Open cboettig opened 8 months ago

cboettig commented 8 months ago

The following example throws a segfault on each machine I've tested. (trying to subset from a stac collection of data in netcdf serialization)

library(rstac)
library(gdalcubes)
items <-
  stac("https://planetarycomputer.microsoft.com/api/stac/v1") |>
  stac_search(collections = "esa-cci-lc-netcdf",
              limit = 100) |>
  post_request() |>
  items_sign(sign_fn = sign_planetary_computer()) 

col <- stac_image_collection(items$features, asset_names = c("netcdf"))
extent <- list(left= -124.40959, bottom=32.53416,
               right=-114.13828,  top=42.00952,
               t0 = "1992-07-01", t1 = "2020-07-01") 
cube <- cube_view(srs ="EPSG:4326",
                  extent = extent,
                  nx = 200, ny = 400, dt = "P1Y")
data <-  raster_cube(col, cube)

data |> plot()
appelmar commented 7 months ago

It turned out that raster_cube() tried to read imagery from the raster bands of the netCDF files, which do not expose any raster bands (they can be read from the subdatasets). To avoid segmentation faults, I've added a simple check if there are raster bands at all.

Changing the creation of the image collection as below would make sure that the subdatasets are read instead:

f = function(url) {
  return(paste0("NETCDF:\"/vsicurl/", url, "\":lccs_class"))
}
col <- stac_image_collection(items$features, asset_names = c("netcdf"), url_fun = f)

The more difficult question seems to be, if this should / can be done automatically in stac_image_collection(). It seems that the STAC response uses the data cube extension, which provides more information on the variables, dimensions, etc, but working with nested lists in asset makes this quite complex.