r-spatial / stars

Spatiotemporal Arrays, Raster and Vector Data Cubes
https://r-spatial.github.io/stars/
Apache License 2.0
558 stars 94 forks source link

issue using `st_cells()` on a cropped image #685

Closed btupper closed 4 months ago

btupper commented 4 months ago

Hi,

I’m having trouble understanding the output on st_cells() when operating upon a cropped image. Below I crop a large image, and create a subsample of points using the cropped image. Then I run st_cells() on each. The values of the cell indices from the cropped image are unexpected.

First we read in data from the starsdata package.

suppressPackageStartupMessages({
  library(sf)
  library(stars)
  library(dplyr)
})
## Warning: package 'sf' was built under R version 4.3.2

## Warning: package 'stars' was built under R version 4.3.2
# read in data
filename= system.file(paste0("netcdf/", "avhrr-only-v2.19810901.nc"), package = "starsdata")
orig = read_stars(filename, sub = "sst", quiet = TRUE, proxy = FALSE)

Next we create a bounding box, and crop the orig.

bb = st_bbox(c(xmin = 270, ymin = 20, xmax = 300, ymax = 50), crs = st_crs(orig))
crop = st_crop(orig, bb)

Make a set of random points in the cropped domain

x = st_sample(bb, 10)

#plot(crop, axes = TRUE, reset = FALSE)
#plot(x, add = TRUE, pch = 16, col = "orange")

Compare the maximum cell index with the output of st_cells. Note the cells indices are less than the maximum possible cell index.

iorig = st_cells(orig, x)
iorig
##  [1] 263215 400012 385657 297742 266079 286228 352517 307842 266133 264608
mx = prod(dim(orig))
cat("max orig index:", mx, "\n")
## max orig index: 1036800
iorig <= mx
##  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

And now the same for the cropped.

icrop = st_cells(crop, x)
icrop
##  [1] 22975 34372 33217 25822 23199 24868 30437 26682 23253 23048
mx = prod(dim(crop))
cat("max cropped index:", mx, "\n")
## max cropped index: 14400
icrop <= mx
##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

Whoops! They are all larger than largest possible index value. I must be missing something. Why does cropping cause st_cells to return erroneous values?

sessionInfo()
## R version 4.3.1 (2023-06-16)
## Platform: x86_64-apple-darwin20 (64-bit)
## Running under: macOS Sonoma 14.4.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/New_York
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] dplyr_1.1.4 stars_0.6-5 abind_1.4-5 sf_1.0-16  
## 
## loaded via a namespace (and not attached):
##  [1] vctrs_0.6.5        cli_3.6.2          knitr_1.42         rlang_1.1.3       
##  [5] xfun_0.40          DBI_1.2.2          KernSmooth_2.23-21 generics_0.1.3    
##  [9] glue_1.7.0         htmltools_0.5.5    e1071_1.7-14       fansi_1.0.6       
## [13] rmarkdown_2.21     grid_4.3.1         tibble_3.2.1       evaluate_0.20     
## [17] classInt_0.4-10    fastmap_1.1.1      lifecycle_1.0.4    yaml_2.3.8        
## [21] compiler_4.3.1     pkgconfig_2.0.3    Rcpp_1.0.12        rstudioapi_0.15.0 
## [25] digest_0.6.33      R6_2.5.1           tidyselect_1.2.1   utf8_1.2.4        
## [29] class_7.3-22       pillar_1.9.0       parallel_4.3.1     magrittr_2.0.3    
## [33] tools_4.3.1        proxy_0.4-27       units_0.8-5
edzer commented 4 months ago
(crop1 = st_normalize(crop))
# stars object with 4 dimensions and 1 attribute
# attribute(s):
#           Min. 1st Qu. Median    Mean 3rd Qu. Max. NA's
# sst [°*C] 5.58 24.9875   27.8 25.8206   28.84 30.5 5384
# dimension(s):
#      from  to         offset delta  refsys x/y
# x       1 120            270  0.25      NA [x]
# y       1 120             50 -0.25      NA [y]
# zlev    1   1          0 [m]    NA      NA    
# time    1   1 1981-09-01 UTC    NA POSIXct    
(icrop = st_cells(crop1, x))
#  [1]  4311  9356  3815  8319  6704  9679  5030 14098  3754 10251
icrop <= mx
#  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

crop is a subgrid in the original raster of orig: the offset didn't change, only the row/col offsets do. st_normalize moves offset and resets row/col indices to start form 1. Should this be added to the docs of st_crop, or st_cell?

btupper commented 4 months ago

Oooooooh! Of course! I love st_normalize already and I haven't used it.

I think it would be great to add to the docs of st_crop and st_cells. I can do a PR. What do you think about adding an optional logical argument normalize to run st_normalize?

edzer commented 4 months ago

What do you think about adding an optional logical argument normalize to run st_normalize?

Yes, that would indeed make sense for st_crop.