r-spatial / stars

Spatiotemporal Arrays, Raster and Vector Data Cubes
https://r-spatial.github.io/stars/
Apache License 2.0
558 stars 94 forks source link

Enhancement Request: Direct Access to Band Values Using Band Names in stars Object #669

Closed PondiB closed 7 months ago

PondiB commented 7 months ago

I've been working with 4D stars objects, particularly for remote sensing data analysis, and encountered a limitation that impacts the usability and intuitiveness of data manipulation. The issue arises when attempting to access specific bands values by their names in a stars object, which currently does not seem to be directly supported.

Example to Illustrate the Issue:

library(stars)

 # Create a 4D array: dimensions are x, y, band, and time
d = c(x = 4, y = 5, band = 8, time = 3)
m = array(rnorm(prod(d)), dim = d)
# Assign band and time dimensions with meaningful names
a = st_as_stars(m) |>
    st_set_dimensions("band", c("B01","B02","B03", "B04","B05", "B06","B07", "B08")) |>
    st_set_dimensions("time", Sys.Date() + 1:4)

 # Example function attempting to use band names directly
openeo_ndvi = function(datacube, nir = "B08", red = "B04"){
  # Ideal: Directly use 'nir' and 'red' band names to manipulate data
  # Current Issue: Lack of a straightforward way to access bands by name
  fn_ndvi = function(red,nir) (nir - red)/(nir + red)
  ndvi = st_apply(datacube, c("x", "y", "time"), fn_ndvi)
  return(ndvi)
}

In this example, the openeo_ndvi function aims to calculate the NDVI using specific bands identified by their names (e.g., "B08" for NIR and "B04" for Red). However, the current API does not provide a direct method to access or manipulate bands using these names within a multi-dimensional array, complicating the implementation of such functions.

I look forward to any discussion or feedback on this matter.

edzer commented 7 months ago

I think this can be done by first selecting the bands, then working on the selected two:

library(stars)
# Loading required package: abind
# Loading required package: sf
# Linking to GEOS 3.12.1, GDAL 3.8.3, PROJ 9.3.1; sf_use_s2() is TRUE
suppressPackageStartupMessages(library(dplyr))

 # Create a 4D array: dimensions are x, y, band, and time
d = c(x = 4, y = 5, band = 8, time = 3)
m = array(rnorm(prod(d)), dim = d)
# Assign band and time dimensions with meaningful names
a = st_as_stars(m) |>
    st_set_dimensions("band", c("B01","B02","B03", "B04","B05", "B06","B07", "B08")) |>
    st_set_dimensions("time", Sys.Date() + 1:3, point = TRUE)

 # Example function attempting to use band names directly
openeo_ndvi = function(datacube, nir = "B08", red = "B04"){
  # Ideal: Directly use 'nir' and 'red' band names to manipulate data
  # Current Issue: Lack of a straightforward way to access bands by name
  datacube = slice(datacube, "band", c(red, nir))
## or: 
  # datacube = datacube[,,,c(nir,red),]
  fn_ndvi = function(red, nir) (nir - red)/(nir + red)
  dims = setdiff(names(dim(datacube)), "band")
  st_apply(datacube, dims, fn_ndvi)
}

openeo_ndvi(a)
# stars object with 3 dimensions and 1 attribute
# attribute(s):
#               Min.   1st Qu.     Median       Mean   3rd Qu.     Max.
# fn_ndvi  -11.39959 -1.031738 -0.2981643 -0.2802616 0.5756361 8.831886
# dimension(s):
#      from to     offset  delta refsys point x/y
# x       1  4          0      1     NA FALSE [x]
# y       1  5          0      1     NA FALSE [y]
# time    1  3 2024-02-27 1 days   Date  TRUE   

This version

PondiB commented 7 months ago

This version

  • uses slice to select band by name, rather than position, as the [ outcommented alternative would do
  • sets dims to all dimensions except band, so it works for arbitrary dimensioned cubes or cubes with different dimension names

Thanks for this example. This will do.