inlabru-org / inlabru

inlabru
https://inlabru-org.github.io/inlabru/
73 stars 21 forks source link

Investigate whether to allow copy models to inherit main/group/replicate expressions (was: Error in A %*% state : not-yet-implemented method for <dgCMatrix> %*% <NULL> after run the model) #147

Open orrortega opened 2 years ago

orrortega commented 2 years ago

Hi,

I have been working with a joint LGCP spatio-temporal model and suddenly I have started to get the following message when the model finish the process:

Error in A %*% state : 
  not-yet-implemented method for <dgCMatrix> %*% <NULL>

I thought that maybe it is because I was doing something wrong when I have added the continuous covariates for the "cp" process, but without them I am still having the same problem.

Thanks in advance,

Oscar

Please find below the part of the model just in case I am doing something wrong:

# Marked Point Process IID Model
cmp_iid_joint <- ~ -1 + point_field(coordinates, 
                                    group=Season, 
                                    group_mapper=bru_mapper_index(73), 
                                    model=matern, 
                                    control.group=list(model="iid")) + 
  mark_field(coordinates, ## keyword to refer to coordinates of SpatialPointsDataFrame
            group = Season, ## group field by temporal index (named ti in data column)
            group_mapper = bru_mapper_index(73), ## there are 5 levels in our temporal index
            model = matern, ## refer back to SPDE defined above
            control.group = list(model="iid", hyper=prior.rho)) + ## AR1 temporal structure with prior on rho
  Inter_point(1) + Inter_mark(1) + ## Intercepts for each likelihood
  scaling_latent(main = 1, model = "linear", mean.linear = 0, prec.linear = 1, n=1, values=1) + ## scaling parameter beta with priors
  MODIS_Freq_3500m + MODIS_FRP_Sum_3000m + MODIS_DistFire_50m + Days_SinceMegafire + Months_SinceMegafire + Years_SinceMegafire +
  Month + Effort ## covariates (column names)

ips <- ipoints(domain = mesh, samplers = boundary)
ips <- cprod(ips, data.frame(Year = seq_len(73)))

lik1_ar1 <- like("cp", ## cox process likelihood
                 formula = coordinates ~ point_field + Inter_point, ## model formula for this likelihood
                 include = c("point_field","Inter_point"), ## which model components to include
                 data = df, ## fitting to ’point’ dataset
                 ips = ips) ## where did sampling take place

lik2_ar1 <- like("poisson", ## poisson likelihood
                 formula = total ~ Inter_mark + point_field*qexppnorm(scaling_latent, rate = prior_rate) + mark_field +
                   MODIS_Freq_3500m + MODIS_FRP_Sum_3000m + MODIS_DistFire_50m + Days_SinceMegafire + Months_SinceMegafire + Years_SinceMegafire +
                   Month + Effort, ## model formula for this likelihood
                 include = c("Inter_mark","point_field", "mark_field", "MODIS_Freq_3500m",  
                             "MODIS_FRP_Sum_3000m", "MODIS_DistFire_50m",            "Days_SinceMegafire" ,         
                             "Months_SinceMegafire",  "Years_SinceMegafire","scaling_latent"),
                 data = df) ## fitting to ’mark’ dataset
fit_ar1_joint <- bru(cmp_iid_joint, 
                     lik1_ar1, 
                     lik2_ar1, 
                     options=list(verbose=TRUE)) ## bring together model components & likelihoods and fit model
ASeatonSpatial commented 2 years ago

Nothing jumped out at me when reading the code. Can you reproduce this error with simulated data and share a reprex?

Also what is qexpporm? Is the idea to have an N(0,1) variable, then take pnorm of this, and then inverse CDF to of exponential to get an exponential variable? When I've done something like this in the past I've done something like

  u_prior = list(prec = list(initial = 0, fixed = TRUE))
  cmp = ~ -1 + u(rep(1, n), model = "iid", hyper = u_prior)

and then transformed u as needed in the formula.

I have also had some issues with multiple likelihoods and intercept parameters, it's not always easy for inlabru to figure out how long to make the vector of ones. You might want to try something like main = rep(1:nrow(.data.), using the .data. keyword, assuming here that nrow() returns something sensible for a spatial points data frame (would double check that, sp can surprise sometimes).

None of these ideas seem very related to that error message though so a reprex would be really helpful.

orrortega commented 2 years ago

Hi Andy,

thanks for your response, I have attached data that I can share. In this case there is also a layer that I am trying to add in the cp.

Thanks so much

Oscar

Please find below the code and data:

coordinates(data)<-c("Longitude","Latitude")

elev <- raster("~/Dropbox/Data/alt/altitude.grd")
elev <- as(elev, 'SpatialGridDataFrame')
elev$elevation <- elev$altitude - mean(elev$altitude, na.rm = TRUE)
slope <- raster("~/Dropbox/Data/slope/slope.grd")
slope <- as(slope, 'SpatialGridDataFrame')

## Split integration points by temporal index
ips <- ipoints(domain = mesh, samplers = m)
ips <- cprod(ips, data.frame(Year = seq_len(2)))
plot(ips)

matern <- inla.spde2.pcmatern(mesh,
                              prior.range = c(100, 0.5),
                              prior.sigma = c(2, 0.05))

f.elev <- function(x, y) {
  # turn coordinates into SpatialPoints object:
  # with the appropriate coordinate reference system (CRS)
  spp <- SpatialPoints(data.frame(x = x, y = y), proj4string = fm_sp_get_crs(elev))
  proj4string(spp) <- fm_sp_get_crs(elev)
  # Extract elevation values at spp coords, from our elev SpatialGridDataFrame
  v <- over(spp, elev)
  if (any(is.na(v$elevation))) {
    v$elevation <- inlabru:::bru_fill_missing(elev, spp, v$elevation)
  }

  return(v$elevation)
}

f.slope <- function(x, y) {
  # turn coordinates into SpatialPoints object:
  # with the appropriate coordinate reference system (CRS)
  spp <- SpatialPoints(data.frame(x = x, y = y), proj4string = fm_sp_get_crs(slope))
  proj4string(spp) <- fm_sp_get_crs(slope)
  # Extract elevation values at spp coords, from our elev SpatialGridDataFrame
  v <- over(spp, slope)
  if (any(is.na(v$slope))) {
    v$slope <- inlabru:::bru_fill_missing(slope, spp, v$slope)
  }
  return(v$slope)
}

# Marked Point Process IID Model all covariates + random effects
cmp_iid_joint <- ~ -1 + point_field(coordinates, 
                                    group=ti, 
                                    group_mapper=bru_mapper_index(2), 
                                    model=matern, 
                                    control.group=list(model="ar1")) + 
  mark_field(coordinates, 
             group=ti, 
             group_mapper=bru_mapper_index(2), 
             model=matern, 
             control.group=list(model="ar1"))  +
  Inter_point(1) + Inter_mark(1) + point_field_copy(copy = "point_field", fixed = FALSE) + 
  elev(map=f.elev(x, y), model = "linear")+ 
  slope(map=f.slope(x, y), model = "linear")

lik1_iid <- like("cp",
                 formula = coordinates ~ .,
                 data = data,
                 ips = ips,
                 include = c("point_field", "Inter_point"))

lik2_iid <- like("poisson",
                 formula = counts ~ .,
                 data = data,
                 exclude = c("point_field", "Inter_point"))

fit_iid_joint <- bru(cmp_iid_joint, 
                     lik1_iid, 
                     lik2_iid, options=list(verbose=TRUE))

Chrodatatest.csv slope.zip alt.zip

orrortega commented 2 years ago

Hi! I just realised, trying to run the code again and comparing, that my problem appears in the joint model predictions!

pred_iid_joint <- predict(fit_iid_joint, 
  data, ~ exp(Inter_mark +
                 point_field +
                 point_field_copy +
                 mark_field +
                 elev  + slope),
                 n.samples = 100)
finnlindgren commented 2 years ago

In https://github.com/inlabru-org/inlabru/issues/147#issuecomment-1148075754, the two likelihoods use the same components, and e.g. point_field_copy isn't used at all. In the predict() call, you can only use components that are well defined for the given data, so usually one would only involve either point_field or point_field_copy, but not both. Just as for the like() definitions, use include= or exclude= in the predict() call to make sure it only tries to evaluate components that are well defined from the provided data. I'm not sure if something like that is also the initial problem, but it would definitely be a problem for the stated predict() call in https://github.com/inlabru-org/inlabru/issues/147#issuecomment-1149663443

finnlindgren commented 2 years ago

@orrortega Can you supply a runnable version of the code, including all the data loading etc, any and all library() calls, etc? I tried adding

library(inlabru)
library(tidyverse)
library(raster)
data <- read_csv("Chrodatatest.csv")

to your code above to be able to load the data, but then it fails due to the missing mesh object, and there might be more things missing later on. (The code needs to be able to run in a "clean" R session.)

orrortega commented 2 years ago

@finnlindgren Thanks for this!

I just came from a conference and I have tried just including point_field and with include/exclude in predict() and it is working.

Thanks so much for all your work and support!

I have attached the clean code below.

library(inlabru)
library(tidyverse)
library(raster)
library(maptools)
library(rgeos)
library(sp)

data <- read_csv("Chrodatatest.csv")
## Spatial information
ldn <- readShapePoly("~/contourisland/contourisland.shp") 
elev <- raster("~/alt/altitude.grd")
elev <- as(elev, 'SpatialGridDataFrame')
elev$elevation <- elev$altitude - mean(elev$altitude, na.rm = TRUE)
slope <- raster("~/slope/slope.grd")
slope <- as(slope, 'SpatialGridDataFrame')

m = gUnaryUnion(ldn)

coordinates(data)←c("Longitude","Latitude")
maxedge<-2*diff(range(data$Longitude))/15
mesh<-inla.mesh.2d(boundary=m, max.edge=maxedge,cutoff = 50)

## Split integration points by temporal index
ips <- ipoints(domain = mesh, samplers = m)
ips <- cprod(ips, data.frame(Year = seq_len(2)))
plot(ips)

matern <- inla.spde2.pcmatern(mesh,
                              prior.range = c(100, 0.5),
                              prior.sigma = c(2, 0.05))

f.elev <- function(x, y) {
  # turn coordinates into SpatialPoints object:
  # with the appropriate coordinate reference system (CRS)
  spp <- SpatialPoints(data.frame(x = x, y = y), proj4string = fm_sp_get_crs(elev))
  proj4string(spp) <- fm_sp_get_crs(elev)
  # Extract elevation values at spp coords, from our elev SpatialGridDataFrame
  v <- over(spp, elev)
  if (any(is.na(v$elevation))) {
    v$elevation <- inlabru:::bru_fill_missing(elev, spp, v$elevation)
  }
  return(v$elevation)
}

f.slope <- function(x, y) {
  # turn coordinates into SpatialPoints object:
  # with the appropriate coordinate reference system (CRS)
  spp <- SpatialPoints(data.frame(x = x, y = y), proj4string = fm_sp_get_crs(slope))
  proj4string(spp) <- fm_sp_get_crs(slope)
  # Extract elevation values at spp coords, from our elev SpatialGridDataFrame
  v <- over(spp, slope)
  if (any(is.na(v$slope))) {
    v$slope <- inlabru:::bru_fill_missing(slope, spp, v$slope)
  }
  return(v$slope)
}

f.aspect <- function(x, y) {
  # turn coordinates into SpatialPoints object:
  # with the appropriate coordinate reference system (CRS)
  spp <- SpatialPoints(data.frame(x = x, y = y), proj4string = fm_sp_get_crs(aspect))
  proj4string(spp) <- fm_sp_get_crs(aspect)
  # Extract elevation values at spp coords, from our elev SpatialGridDataFrame
  v <- over(spp, aspect)
  if (any(is.na(v$aspect))) {
    v$slope <- inlabru:::bru_fill_missing(aspect, spp, v$aspect)
  }
  return(v$slope)
}

# Marked Point Process IID Model all covariates + random effects
cmp_iid_joint <- ~ -1 + point_field(coordinates, 
                                    group=ti, 
                                    group_mapper=bru_mapper_index(2), 
                                    model=matern, 
                                    control.group=list(model="ar1")) + 
  mark_field(coordinates, 
             group=ti, 
             group_mapper=bru_mapper_index(2), 
             model=matern, 
             control.group=list(model="iid"))  +
  Inter_point(1) + Inter_mark(1) + point_field_copy(copy = "point_field", fixed = FALSE) + 
  elev(map=f.elev(x, y), model = "linear")+ 
  slope(map=f.slope(x, y), model = "linear")

lik1_iid <- like("cp",
                 formula = coordinates ~ .,
                 data = data,
                 ips = ips,
                 include = c("point_field", "Inter_point", "elev", "slope","aspect"))

lik2_iid <- like("poisson",
                 formula = counts ~ .,
                 data = data,
                 exclude = c("point_field", "Inter_point","elev"))

fit_iid_joint <- bru(cmp_iid_joint, 
                     lik1_iid, 
                     lik2_iid, options=list(verbose=TRUE))

summary(fit_iid_joint)

predjoint <- predict(fit_iid_joint,
                     data,
                     ~ (Inter_mark +
                          point_field +
                          mark_field +
                          elev + aspect + slope
                     ),
                     n.samples=1000)

pxl <- pixels(mesh, mask=m)
pxl <- cprod(pxl, data.frame(ti=seq_len(2)))

#####Point Fields####
predpoint <- predict(fit_iid_joint,
                     pxl,
                     ~ point_field , 
                     include=c("point_field", "Inter_point", "elev", "slope","aspect") 
                     ,n.samples=1000)
#####Mark Fields#####
predmark <- predict(fit_iid_joint,
                    pxl,
                    ~ mark_field , include=c("mark_field")
                    ,n.samples=1000)

contourisland.zip

finnlindgren commented 2 years ago

There are a couple of problems with the code, relating to old behaviour, as well as point process logic, but when I fixed those I discovered what seems to be a possible bug or undocumented behaviour for copy models with group. I'll check it out and let you know.

finnlindgren commented 2 years ago

Code with various corrections&clarifications. In particular, "copy" components currently must be given explicit input expressions; they do not (by default) access the same variables as the original components. The actual INLA estimation seems to be a bit unstable; it detected a problem with the hessian calculations and restarted, so I haven't run it long enough to check the predict calls, but they look like they should be ok.

library(INLA)
library(inlabru)
library(tidyverse)
library(raster)
library(maptools)
library(rgeos)
library(sp)

data <- read_csv("Chrodatatest.csv")
## Spatial information
ldn <- readShapePoly("border/contourisland.shp")
elev <- raster("alt/altitude.grd")
elev <- as(elev, 'SpatialGridDataFrame')
elev$elevation <- elev$altitude - mean(elev$altitude, na.rm = TRUE)
slope <- raster("slope/slope.grd")
slope <- as(slope, 'SpatialGridDataFrame')

m = gUnaryUnion(ldn)

# Coordinate names in the data are Longitude and Latitude, but
# are clearly actually UTM coordinates!
# To make sure to avoid confusion (and to make sure we're not accidentally
# relying on old behaviour that sometimes renamedF coordinates to x and y,
# let's copy them to Easting and Northing and use that instead everywhere.

data$Easting <- data$Longitude
data$Northing <- data$Latitude

coordinates(data) <- c("Easting", "Northing")
maxedge<-2*diff(range(data$Longitude))/15
mesh<-inla.mesh.2d(boundary=m, max.edge=maxedge,cutoff = 50)

## Split integration points by temporal index
ips <- ipoints(domain = mesh, samplers = m)
ips <- cprod(ips, data.frame(Year = seq_len(2)))

ggplot() +
  gg(mesh) +
  gg(ips, aes(size=weight)) +
  gg(m)

matern <- inla.spde2.pcmatern(mesh,
                              prior.range = c(100, 0.5),
                              prior.sigma = c(2, 0.05))

f.elev <- function(x, y) {
  # turn coordinates into SpatialPoints object:
  # with the appropriate coordinate reference system (CRS)
  spp <- SpatialPoints(data.frame(x = x, y = y), proj4string = fm_sp_get_crs(elev))
  proj4string(spp) <- fm_sp_get_crs(elev)
  # Extract elevation values at spp coords, from our elev SpatialGridDataFrame
  v <- over(spp, elev)
  if (any(is.na(v$elevation))) {
    v$elevation <- inlabru:::bru_fill_missing(elev, spp, v$elevation)
  }
  return(v$elevation)
}

f.slope <- function(x, y) {
  # turn coordinates into SpatialPoints object:
  # with the appropriate coordinate reference system (CRS)
  spp <- SpatialPoints(data.frame(x = x, y = y), proj4string = fm_sp_get_crs(slope))
  proj4string(spp) <- fm_sp_get_crs(slope)
  # Extract elevation values at spp coords, from our elev SpatialGridDataFrame
  v <- over(spp, slope)
  if (any(is.na(v$slope))) {
    v$slope <- inlabru:::bru_fill_missing(slope, spp, v$slope)
  }
  return(v$slope)
}

f.aspect <- function(x, y) {
  # turn coordinates into SpatialPoints object:
  # with the appropriate coordinate reference system (CRS)
  spp <- SpatialPoints(data.frame(x = x, y = y), proj4string = fm_sp_get_crs(aspect))
  proj4string(spp) <- fm_sp_get_crs(aspect)
  # Extract elevation values at spp coords, from our elev SpatialGridDataFrame
  v <- over(spp, aspect)
  if (any(is.na(v$aspect))) {
    v$slope <- inlabru:::bru_fill_missing(aspect, spp, v$aspect)
  }
  return(v$slope)
}

# Create Year-index:
data$Year <- as.numeric(as.factor(data$year))

# Marked Point Process IID Model all covariates + random effects
# Note: "copy" components must be informed where they should be evaluated,
# and do _not_ inherit that information from the original component.
# In the future, this might change, e.g. by giving an explicit error message,
# or explicitly using the same expressions as the original (by default).
cmp_iid_joint <- ~ -1 + point_field(coordinates,
                                    group=Year,
                                    group_mapper=bru_mapper_index(2),
                                    model=matern,
                                    control.group=list(model="ar1")) +
  mark_field(coordinates,
             group=Year,
             group_mapper=bru_mapper_index(2),
             model=matern,
             control.group=list(model="iid"))  +
  Inter_point(1) + Inter_mark(1) +
  point_field_copy(coordinates, copy = "point_field", group = Year, fixed = FALSE) +
  elev(f.elev(Easting, Northing), model = "linear")+
  slope(f.slope(Easting, Northing), model = "linear")

# point process over space, but varying intensity in time
lik1_iid <- like("cp",
                 formula = coordinates + Year ~ .,
                 data = data,
                 ips = ips,
                 include = c("point_field", "Inter_point", "elev", "slope","aspect"))

lik2_iid <- like("poisson",
                 formula = counts ~ .,
                 data = data,
                 exclude = c("point_field", "Inter_point","elev"))

fit_iid_joint <- bru(cmp_iid_joint,
                     lik1_iid,
                     lik2_iid, options=list(verbose=TRUE))

summary(fit_iid_joint)

predjoint <- predict(fit_iid_joint,
                     data,
                     ~ (Inter_mark +
                          point_field +
                          mark_field +
                          elev + aspect + slope
                     ),
                     n.samples=1000)

pxl <- pixels(mesh, mask=m)
pxl <- cprod(pxl, data.frame(Year=seq_len(2)))
# To ensure consistent coordinate names:
coordnames(pxl) <- coordnames(data)

#####Point Fields####
predpoint <- predict(fit_iid_joint,
                     pxl,
                     ~ point_field ,
                     include=c("point_field", "Inter_point", "elev", "slope","aspect")
                     ,n.samples=1000)
#####Mark Fields#####
predmark <- predict(fit_iid_joint,
                    pxl,
                    ~ mark_field , include=c("mark_field")
                    ,n.samples=1000)
orrortega commented 2 years ago

Thanks so much for your help Finn.

I know that the spatial variables (elevation, slope and aspect) are not helping much, and probably should be excluded, as the island is too small the plant doesn't care about the physical variables.

I will run the whole code and check the predictions.

Thanks again,

Oscar

orrortega commented 2 years ago

Hi @finnlindgren,

I was having issues with the predictions with the joint model (mark and point fields are working fine, except with an issue with the coordinates. Running this

predjoint <- predict(fit_iid_joint,
                     data,
                     ~ (Inter_mark +
                          point_field +
                          mark_field +
                          elev + aspect + slope
                     ),
                     n.samples=1000)

I obtained the following error:

Error in Inter_mark + point_field + mark_field + elev + aspect : 
  non-numeric argument to binary operator
In addition: Warning messages:
1: In proj4string(obj) : CRS object has comment, which is lost in output
2: In proj4string(obj) : CRS object has comment, which is lost in output

I have changed the predict to have a similar estructure than the mark and the point:

predjoint <- predict(fit_iid_joint,
                     data,
                     ~ (Inter_mark +
                          point_field +
                          mark_field),
                     include=c("point_field", "Inter_mark",  "mark_field", "elev", "slope","aspect"),
                     n.samples=1000)

And now I only have the issue with the projections but it looks fine.

Cheers

Oscar

finnlindgren commented 2 years ago

The CRS warnings are usually ignorable; the way sp implemented PROJ6 support unfortunately isn't as seamless as one could hope for.

Using include to make sure it only tries to evaluate components that can be evaluated using the given data is how it's meant to work.

orrortega commented 2 years ago

Thanks so much for all your help