inlabru-org / inlabru

inlabru
https://inlabru-org.github.io/inlabru/
76 stars 21 forks source link

Handle NAs more clearly for predictions (was: Two questions regarding time knot and interactions) #132

Open joendijk opened 2 years ago

joendijk commented 2 years ago

Hi,

I have two questions related to the same models I have been working on, and mostly comes down to proper coding. I have not done any R-INLA or inlabru workshop, so the knowledge I have picked up so far is through the internet, available literature, and my supervisor.

My first question is about coding a spatio-temporal model with time knots. I use weeks as my scale for time. In INLA I would prepare a one-dimensional mesh: k <- 10 tknots <- seq(min(week) + (0.5nweek)/k, max(week) - (0.5nweek)/k, length=k) mesh.t <-inla.mesh.1d(tknots)

I expected this would be the same with inlabru, but I am not sure whether this step is needed or not.

When I use the above mesh.t as group_mapper: form1 = resp ~ Intercept + ship(fveskey, model = "iid", n = n_veskey) + myspde(coordinates, group = week, group_mapper = bru_mapper(mesh.t, indexed=TRUE), model = inla.spde2.matern(mesh1c), control.group = list(model="ar1"))

I get a different outcome when not using mesh.t, but using bru_mapper_index(10): form1i = resp ~ Intercept + ship(fveskey, model = "iid", n = n_veskey) + myspde(coordinates, group = week, group_mapper = bru_mapper_index(10), model = inla.spde2.matern(mesh1c), control.group = list(model="ar1"))

When I plot both myspde's I see in both cases the time knots, same intervals on same scale x-axis, but different mode and quantiles and summaries and dic and waic are different as well. The first model form has the better values. I am not sure if defining mesh.t is necessary or even correct with inlabru, and would like to understand what the difference is between using bru_mapper(mesh.t) with 10 time knots, and bru_mapper_index(10).

++

My other question is about how to define interactions in the form. e.g. using "*" or ":" with inlabru. From what I've read this should work for inla() just like in glm() for example. However, I'm not sure how to define it for bru(), so far what I have tried doesn't work.

I have tried, but clearly does not work: form3 = resp ~ Intercept(1) + depth + temp + sal + windspeed : windir(fwinddir, model = "rw1", cyclic =TRUE, n = n_winddir) + ship(fveskey, model = "iid", n = n_veskey) + year(cyear, model = "iid", n = ncyear) + myspde(coordinates, group = week, group_mapper = bru_mapper(mesh.t, indexed=TRUE), model = inla.spde2.matern(mesh1c), control.group = list(model="ar1"))

I came across this forum post: https://github.com/inlabru-org/inlabru/discussions/101 and started experimenting. The model runs with it, but I am not at all sure whether it's doing the right thing, or not.

This form works: form3 = resp ~ Intercept(1) + depth + temp + sal + wind(wind.int(windspeed:fwinddir), model = "rw1", cyclic =TRUE, n = n_winddir) + ship(fveskey, model = "iid", n = n_veskey) + year(cyear, model = "iid", n = ncyear) + myspde(coordinates, group = week, group_mapper = bru_mapper(mesh.t, indexed=TRUE), model = inla.spde2.matern(mesh1c), control.group = list(model="ar1"))

In the summary I see the effect of wind, but I'm not sure what happened nor whether this is the correct way to define an interaction.

I hope my questions are clear and someone can advise me. Many thanks in advance!

finnlindgren commented 2 years ago

For time indexed discretely and with no need for interpolation, bru_mapper_index(10) is the easiest and matches basic INLA group model indexing. The inla.mesh.1d approach is useful if you want to be able to interpolate between the modelled time points, or to use non-integer indexing (but note that the group models will treat the distance between each know as equal, so it just affects the indexing itself). The bru_mapper(mesh1d, indexed=TRUE) mapper maps the knots to integer indices, and values in between the knots will be interpolations of the values at the nearest knots. So if your weeks are 1:52, then bru_mapper(inla.mesh.1d(1:52), indexed=TRUE) would be equivalent to bru_mapper_index(52) for the integer values 1 through 52, but "decimal weeks" such as 4.3 would be interpolations (0.7 u(4)+0.3 u(5) for 4.3). If you use a higher order basis function, with inla.mesh1d(..., degree=2), the latent field will control the weights for the basis functions instead of the values at the knots, and all values will be weighted sums of latent variables, including the values at the knots.

finnlindgren commented 2 years ago

For the second question, the and : notation for covariate "interactions" is not supported. In inlabru, the predictor expression is a regular R expression, so is simply plain multiplication, and : is a sequence constructor. I've started thinking of a special component type that could take a formula and/or model matrix, so that the user can have control over fixed effect interactions. However, in your case your're looking for a structured random effects model, so need to use the INLA features for that, with either group (for some dependence across wind speeds) or replicate (independence between wind speeds. For your windspeed&winddir problem, a slight change in your second version should work:

wind(windir, model="rw1", mapper=bru_mapper(winddir_knots, interval=c(0,360)), indexed=FALSE), cyclic=TRUE,
   replicate=windspeed, replicate_mapper = ...)
wind(windir, model="rw1", mapper=bru_mapper(winddir_knots, interval=c(0,360)), indexed=FALSE), cyclic=TRUE,
   group=windspeed, group_mapper = ..., control.group=...)

Both of these will construct a kronecker precision model, just as for your myspde component.

joendijk commented 2 years ago

Thanks again for the answers! Here a follow up question (or actually a few). I went with bru_mapper_index(10), I was curious what the difference would make between independence and dependence across wind speeds, and independence gave the better models. I assumed that group_mapper = ... and control.group=... in the provided examples were optional and left these out (I hope I was correct here), so final models looked like:

form2b = TURclass ~ Intercept(1) +
  depth + temp + sal +
  wind(fwinddir, model = "rw1", mapper = bru_mapper_index(8), # 8 cardinal/intercardinals numbered 1 through 8
       cyclic = TRUE, replicate = windspeed) +
  ship(fveskey, model = "iid", n = n_veskey) +
  year(cyear, model = "iid", n = ncyear) +
  myspde(coordinates, group = week, group_mapper = bru_mapper_index(10),
         model = inla.spde2.matern(mesh1c), control.group = list(model="ar1"))

I got interesting results and decided to run predictions. And this is my main question here and relates to bru_mapper_index(10). Since I defined these time knots I wanted to make 10 gradient plots, similarly to the inlabru tutorial "08 Spatio-temporal modelling". However, I wasn't sure how the time knots were defined in the bru model, I scanned through mod2b_nb$bru_info, and I thought group would be correct since it has a range of 1 to 10, but this didn't really work. What did work was week:

df <- pixels(mesh1c)
df1 <- SpatialPixelsDataFrame(coordinates(df), 
                              data.frame(week = rep(1, nrow(coordinates(df)))))

logint1 = predict(object=mod2b_nb, data=df1, formula= ~(Intercept + myspde))

ggplot() + gg(logint1, mapping = aes_string(fill = "mean")) + csc

The above example gives the gradient plot for week 1, or time knot 1, as I wasn't sure what was being predicted, until I tried predicting week 11 and 52, in addition to week 1 through 10. For 1 through 10 I get gradient plots as seen in the tutorial, for week 11 and 52, interestingly enough R does run the predict() successfully, but the gradient plot is a single colour. So I guess what goes on in the model function with bru_mapper_index(10) week numbers that are indexed as 1 are replaced with a 1? Still, I do not understand why predict() then doesn't throw me an error when I want a prediction for week 11 or 52. I mean, when all week numbers are replaced with index numbers 1 through 10, it shouldn't be able to compute non-existing index numbers. Or am I wrong here?

Looking forward to your answer. Many thanks in advance!

finnlindgren commented 2 years ago

Looking at the inlabru:::ibm_amatrix.bru_mapper_index, it detects values that cannot be mapped and excludes them from the calculations, so that the corresponding effect becomes zero. This can probably be changed to at least a warning message; this will be part of a general update on some of the mapper code to handle NA values in a predictable&useful way as well.