rspatial / terra

R package for spatial data handling https://rspatial.github.io/terra/reference/terra-package.html

Unexpected results when terra::app called on SpatRaster containing strings #1566

Open shuysman opened 4 months ago

shuysman commented 4 months ago

If the cells of a SpatRaster contain strings, terra::app returns unexpected results when it is used to perform math operations on the cells. In addition, terra::values() returns integer codes (here a permutation of 1:ncell) instead of the string values, which may be related.

library(terra)
vals <- c("321", "62", "888", "1", "foo", "bar")
(test <- rast(matrix(vals, nrow = 3, ncol = 2)))
#class       : SpatRaster 
#dimensions  : 3, 2, 1  (nrow, ncol, nlyr)
#resolution  : 1, 1  (x, y)
#extent      : 0, 2, 0, 3  (xmin, xmax, ymin, ymax)
#coord. ref. :  
#source(s)   : memory
#categories  : label 
#name        : lyr.1 
#min value   :     1 
#max value   :   foo 
values(test)
#    lyr.1
#[1,]     2
#[2,]     1
#[3,]     3
#[4,]     6
#[5,]     4
#[6,]     5
values(terra::app(test, fun = \(i) i * 10))
#     lyr.1
#[1,]    20
#[2,]    10
#[3,]    30
#[4,]    60
#[5,]    40
#[6,]    50

If I try to perform a mathematical operation with a string and a number, I would expect an error:

"2" * 10
#Error in "2" * 10 : non-numeric argument to binary operator

I'm not sure whether this is intended behavior, but it bit me when I multiplied the values of a SpatRaster by 10. The values were stored as strings without my being aware of it, so instead of an error I received a SpatRaster with values I did not expect. Examining the SpatRaster with print() and plot() did not make it apparent to me that it contained strings either, so I did not catch the bug in my code until much later in the pipeline.
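A defensive check along these lines would have surfaced the problem earlier. This is only a minimal sketch: app_numeric is a hypothetical helper name, not part of terra, and it assumes that is.factor() reports the categorical layer created from the character matrix above.

```r
library(terra)

# Hypothetical helper (not part of terra): refuse to run app() when any
# layer of the SpatRaster is categorical, instead of silently operating
# on the integer category codes.
app_numeric <- function(x, fun, ...) {
  if (any(is.factor(x))) {
    stop("SpatRaster has categorical layers; fix or convert the input first")
  }
  app(x, fun, ...)
}

vals <- c("321", "62", "888", "1", "foo", "bar")
test <- rast(matrix(vals, nrow = 3, ncol = 2))

# Capture the error message rather than aborting the script
res <- tryCatch(
  app_numeric(test, \(i) i * 10),
  error = function(e) conditionMessage(e)
)
print(res)
```

With the raster from the report, the helper should stop with the message above rather than return multiplied category codes.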

Debian 12, R version 4.4.1 (2024-06-14) -- "Race for Your Life"

packageVersion("terra")
#[1] ‘1.7.78’
terra::gdal(lib="all")
#    gdal     proj     geos 
# "3.6.2"  "9.1.1" "3.11.1" 

shuysman commented 3 months ago

Looking at my issue again, I noticed that the print() output does show that this is a categorical raster (the "categories : label" line), so this may well be the intended behavior. I was not familiar with categorical rasters before this. My confusion came from loading malformed input that contained strings where numbers were expected, which caused terra to load it as a categorical raster.
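One way to catch this kind of malformed input before it reaches rast() is to test the conversion in base R first. This is a sketch under the assumption that the input arrives as a character vector; the names num and bad are illustrative only.

```r
# Input as in the report: mostly numeric strings plus two stray labels
vals <- c("321", "62", "888", "1", "foo", "bar")

# Attempt the conversion; entries that are not numbers become NA
num <- suppressWarnings(as.numeric(vals))

# Entries that were present but failed to convert
bad <- vals[is.na(num) & !is.na(vals)]
print(bad)  # "foo" "bar"

if (length(bad) > 0) {
  message("Non-numeric entries found: ", paste(bad, collapse = ", "))
}

# A numeric matrix (here with NA for the bad entries) is what one would
# then pass on, so that the result is not built from character data
m <- matrix(num, nrow = 3, ncol = 2)
```

Whether to stop, warn, or replace the offending entries with NA depends on the pipeline; the point is only to make the check explicit before the raster is created.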

rhijmans commented 3 months ago

The SpatRaster is as intended. Also note:

as.data.frame(test)
#  lyr.1
#1   321
#2     1
#3    62
#4   foo
#5   888
#6   bar

It is not obvious whether app should give an error here instead of using the "raw" integer representation of the categorical variable (factor). I think it should give an error, and only use the integer values if specifically directed to do so (e.g. with raw=TRUE) as in subst. That logic would affect many other methods as well (e.g. test + 1); I need some time to think about that.
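For comparison, base R factors show the same split between labels and "raw" integer codes. The sketch below is only an analogy and says nothing about terra's internals; the level order is set explicitly to match the codes seen in the values() output above.

```r
# Labels in cell order, as shown by as.data.frame(test)
labels <- c("321", "1", "62", "foo", "888", "bar")

# Explicit levels so the codes do not depend on locale-specific sorting
f <- factor(labels, levels = c("1", "321", "62", "888", "bar", "foo"))

# The "raw" integer representation of the factor
codes <- as.integer(f)
print(codes)       # 2 1 3 6 4 5
print(codes * 10)  # 20 10 30 60 40 50

# Base R does not use the codes implicitly: arithmetic on a factor
# gives a warning and a vector of NA
res <- suppressWarnings(f * 10)
print(res)

# The labels are recovered with as.character()
print(as.character(f))
```

Base R thus takes the stricter route for factors: arithmetic is not meaningful unless the codes are requested explicitly with as.integer(), which is close in spirit to the raw=TRUE idea.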