inlabru-org / inlabru

inlabru
https://inlabru-org.github.io/inlabru/
73 stars 21 forks source link

SpatRaster support not working properly #169

Closed silasprincipe closed 1 year ago

silasprincipe commented 1 year ago

Hi! The SpatRaster support added on the last inlabru version was a good call (very useful!), but I think that it's not working properly (at least not on the same way as the Spatial*DataFrame do).

For example, let's supose I have an SP object with two layers named temp and sal. Then I can construct the components like:

~ temp(my_sp_object, model = "linear") + sal(my_sp_object, model = "linear") + Intercept(1)

And inlabru would correctly find the temp or sal layers in the SP object in a bru call [at least in the previous versions].

Now, if I have a SpatRaster with both layers and use:

~ temp(my_SpatRaster, model = "linear") + sal(my_SpatRaster, model = "linear") + Intercept(1)

It will not work, returning the error: Error: [subset] no valid variable name found. Adding main_layer = "temp" does not help either. But if I construct the components calling the SpatRaster layer directly it will work as intended:

~ temp(my_SpatRaster$temp, model = "linear") + sal(my_SpatRaster$sal, model = "linear") + Intercept(1)

I believe that a support for SpatRasters with multiple layers in the same way as the sp object would be perfect if possible.

finnlindgren commented 1 year ago

I think the implementation was meant to work the same in that respect. In particular, setting main_layer="temp" was definitely meant to work. Are you able to construct a small test file&code so I can test it?

finnlindgren commented 1 year ago

Does ~ temp(eval_spatial(my_SpatRaster, .data., layer="temp"), ...) work?

silasprincipe commented 1 year ago

Hi Finn! I think I just discovered the problem. It's on the eval_spatial.SpatRaster code.

On this part:

  if (!inherits(where, "SpatVector")) {
    where <- terra::vect(where)
  }
  val <- terra::extract(
    data,
    where,
    ID = FALSE,
    layer = layer
  )
}

If where is a SpatialPoints, terra will convert it to a SpatVector, but using layer on the extract function will look for layer on the y argument, that is, the SpatialVector (I know it's confusing, but it's on the terra docs... I believe that this was not what was intended).

Getting just this part, but supplying a data.frame as the where object in terra::extract will work:

val <- terra::extract(
    data,
    where
  )

But then it's not possible to add the layer argument. If I can suggest a modification:

val <- terra::extract(
    env[[layer]],
    where,
    ID = FALSE
  )

 return(val)

Will work just fine (layer can be either a name or a numeric indicating which layer).

silasprincipe commented 1 year ago

Oh Finn, just a reproducible example of the problem, so you can see that it's not working properly (but I believe that the previous suggestion would tackle the problem properly):

library(inlabru)
library(INLA)
library(terra)

# Load the Gorilla data
data(gorillas, package = "inlabru")

covs <- c(rast(gorillas$gcov$vegetation), rast(gorillas$gcov$elevation))

# Define SPDE prior
matern <- INLA::inla.spde2.pcmatern(gorillas$mesh,
                                    prior.sigma = c(0.1, 0.01),
                                    prior.range = c(0.01, 0.01)
)

# Define domain of the LGCP as well as the model components (spatial SPDE
# effect and Intercept)
cmp <- coordinates ~ 
  elev(covs, model = "linear", main_layer = "elevation")+
  mySmooth(coordinates, model = matern) +
  Intercept(1)

# Fit the model (with int.strategy="eb" to make the example take less time)
fit <- lgcp(cmp, gorillas$nests,
            samplers = gorillas$boundary,
            domain = list(coordinates = gorillas$mesh),
            options = list(control.inla = list(int.strategy = "eb"))
)
finnlindgren commented 1 year ago

Thanks! The right solution might be a bit tricky, as the code was intended to allow a different layer to be extracted for each data point, to allow space-time data to be handled sensibly, which wouldn't work with env[[layer]]. But I think some previous code solved that by extracting only one layer at a time; slower, but would solve the only-single-layer-allowed problem. I think there's a similar issue with the stars data model.

Rereading the terra documentation,

   layer: character or numeric to select the layer to extract from for
          each geometry. If ‘layer’ is a character it can be a name in
          ‘y’ or a vector of layer names. If it is numeric, it must be
          integer values between ‘1’ and ‘nlyr(x)’

I see that it seems to be a hybrid of the inlabru selector and layer arguments, which is probably why I misinterpreted it a bit. So if the inlabru main_layer is numeric, it will do what we want, but if one wants to refer to a layer in x by name, one has to store it in a named column of y? That seems odd. Looking at the actual terra::extract code for SpatRaster,SpatVector, if layer is length 1, it extracts that column from y, and uses that as the layer vector for x. If layer is longer, it uses it directly as the layer vector into x. I think that means that the problem may be only for the length(layer)==1 case, and we can fix it in eval_spatial by expanding it to a vector of length nrow(where) before calling terra::extract. Except that it would then still break if one only wants to extract a value at a single point. I think that should be classified as a logic or design bug in terra::extract! The extract behaviour for length(layer)==1 is precisely the reason the inlabru selector argument exists; it does something different than the layer argument.

finnlindgren commented 1 year ago

So a "safer" solution could be to setup the where argument more specifically, and always set a special layer column there, so that layer would always refer to that column. But that would be relying on what I think is really a bug/misfeature, so I'm not entirely happy with that thought...

finnlindgren commented 1 year ago

Example showing it works almost as intended when layer is a vector; the issue I see is the elevation comes out as character, but maybe that's due to how you constructed the covs variable?

library(inlabru)
#> Loading required package: sp
library(INLA)
#> Loading required package: Matrix
#> Loading required package: foreach
#> Loading required package: parallel
#> This is INLA_22.10.23 built 2022-10-23 18:30:23 UTC.
#>  - See www.r-inla.org/contact-us for how to get help.
#>  - To enable PARDISO sparse library; see inla.pardiso()
library(terra)
#> terra 1.6.17

# Load the Gorilla data
data(gorillas, package = "inlabru")

covs <- c(rast(gorillas$gcov$vegetation), rast(gorillas$gcov$elevation))

y_vege <- eval_spatial(covs, where = gorillas$nests,
                       layer = rep("vegetation", nrow(gorillas$nests)))
y_elev <- eval_spatial(covs, where = gorillas$nests,
                       layer = rep("elevation", nrow(gorillas$nests)))
y_mix <- eval_spatial(covs, where = gorillas$nests,
                      layer = rep(c("vegetation", "elevation"),
                                  c(2, nrow(gorillas$nests)-2)))

y_vege[1:4]
#> [1] "Disturbed" "Primary"   "Primary"   "Disturbed"
y_elev[1:4]
#> [1] "2008" "1699" "1872" "1678"
y_mix[1:4]
#> [1] "Disturbed" "Primary"   "1872"      "1678"
finnlindgren commented 1 year ago

In the inlabru model definition, one should be able to do

cmp <- coordinates ~
  elev(covs, model = "linear", main_layer = rep("elevation", nrow(.data.)))+
  mySmooth(coordinates, model = matern) +
  Intercept(1)

but due to the character output from the raster that fails. But that's definitely due to how the code above constructs the raster; when both rasters have a numeric, this modified code works, as long as nrow(.data.) > 1. I'm more convinced now that the terra::extract behaviour for length(layer)==1 was a design error on their part.

silasprincipe commented 1 year ago

Hi Finn! I can confirm that the above code (main_layer = rep("elevation", nrow(.data.))) works just fine. But maybe is still a good idea to try a more elegant solution that does that repetition automatically.

What if you change the extract_layer code, so it will generate the repetition? For example, if one try:

layer <- inlabru:::extract_layer(my_points, layer = NULL, selector = NULL)

It will return a vector of 1s == nrow(my_points). But instead, if one supply:

layer <- inlabru:::extract_layer(my_points, layer = "my_layer", selector = NULL)

It will return a single character "my_layer". So maybe if you change the code for:

extract_layer <- function(where, layer, selector) {
  if (!is.null(layer) && !is.null(selector)) {
    warning("Both layer and selector specified. Ignoring selector",
            immediate. = TRUE
    )
    selector <- NULL
    if (length(layer) == 1) {
      layer <- rep(layer, NROW(where))
    }
  } else if (!is.null(selector)) {
    layer <- extract_selector(where, selector)
  } else if (is.null(layer) && is.null(selector)) {
    layer <- rep(1, NROW(where))
  } else if (!is.null(layer) && is.null(selector)){
    layer <- rep(layer, NROW(where))
  }
  layer
}

Adding this additional else if will make the function produce a vector of layer names that will fit the terra::extract (of course, I don't know if this affect the behavior you expect when working with space-time data!).

extract_layer(my_points, "my_layer", NULL)
# > [1] "my_layer" "my_layer" "my_layer"
finnlindgren commented 1 year ago

Needed a slight modification of that, since it should only do the rep() if layer has length 1:

extract_layer <- function(where, layer, selector) {
  if (!is.null(layer)) {
    if (!is.null(selector)) {
      warning("Both layer and selector specified. Ignoring selector",
              immediate. = TRUE
      )
    }
    selector <- NULL
  } else if (!is.null(selector)) {
    layer <- extract_selector(where, selector)
  } else if (is.null(layer)) {
    layer <- 1
  }
  if (length(layer) == 1) {
    layer <- rep(layer, NROW(where))
  }
  layer
}

Also added special case to eval_spatial.SpatRaster() for the nrow(where)==1 case, to avoid that special case in the terra::extract():

if ((NROW(where) == 1) && (terra::nlyr(data) > 1)) {
    # Work around issue in terra::extract() that assumes `layer` to point
    # to a column of `where` (like `selector`) when
    # length(layer)==1 (NROW(where)==1),
    # but otherwise be used directly for indexing into data.
    # When nlyr == 1, terra:extract ignores the layer input.
    val <- terra::extract(
      data,
      rbind(where, where),
      ID = FALSE,
      layer = c(layer, layer)
    )
    val <- val[1, , drop = FALSE]
  } else {
    val <- terra::extract(
      data,
      where,
      ID = FALSE,
      layer = layer
    )
  }
silasprincipe commented 1 year ago

Hi Finn! I just tested the devel version and it worked perfectly! Many thanks for solving that issue! I'm sure that this added support to SpatRaster will be very helpful.

Should I close the issue now?

finnlindgren commented 1 year ago

No, leave it open; these kinds of bugs I leave open but marked with the label "solved-in-devel", so they stay visible until the next CRAN release; the hope is that this will minimise the number of duplicate bug reports.