Problem in the bru prediction formula of a zero inflated poisson model with two random effects:

inlabru-org / inlabru

https://inlabru-org.github.io/inlabru/

Problem in the bru prediction formula of a zero inflated poisson model with two random effects: #50

Closed ritacardoso closed 1 year ago

ritacardoso commented 5 years ago

@FinnLindgren As we discussed this afternoon, I created a model to predict tick abundance in mainland Scotland. It has a zero-inflated Poisson distribution (zero inflated Poisson1) with two random effects in addition to the fixed effects: the site of tick collection and the number of samples per site. The dataset comes from tick surveys in which collection sites were selected across Scotland and several samples were taken at each site. The table covers 686 sites with multiple samples per site, a total of 10611 samples (each sample is a row). The measure of interest is the count of ticks per sample. The model is:

model <- bru(count ~ Intercept + frost + rain + deer + forest + mysmooth + site + sample

I added the random effects to the prediction formula, but I am not sure whether the formula is correct:

e.15 <- predict(model, pixels(mesh), ~ exp(mysmooth + rain + frost + deer + forest + Intercept + site + sample))

My question is: how should the prediction formula be written to take into account the variation of the ID effects (number of sites and number of samples)?

Thank you very much, Rita

finnlindgren commented 5 years ago

This is a good question, and one that neither raw INLA nor inlabru has a good solution to. To do "new site and sample" prediction directly from the model, i.e. with "new" realisations of the site-specific and sample-specific random effects, the latent model must already contain the new site and sample IDs when the model is estimated. However, this is very expensive and wasteful.

An alternative may be to introduce the randomness in the prediction formula itself, using the model parameters as input (which will be sampled internally by predict). Something like

newdata <- pixels(mesh)
N <- nrow(newdata)
predict(model, newdata, ~ exp(... +
  rnorm(N, sd=site_Precision^-0.5) + rnorm(N, sd=sample_Precision^-0.5)))

I haven't tested this, but I believe the philosophy behind the inlabru predict function should allow it. One needs to figure out the actual names of the precision parameters; if the documentation doesn't give hints, we'll need to look at the internal code and update the docs.
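
To make the idea concrete, here is a minimal base-R sketch of what the rnorm terms do, assuming predict evaluates the formula once per posterior sample. It does not use inlabru or a fitted model: eta, site_precision and sample_precision are hypothetical stand-ins for posterior draws of the linear predictor and of the two precision parameters, and their distributions are made up for illustration. The point is only that a precision tau corresponds to a standard deviation tau^-0.5, and that adding a fresh draw per posterior sample widens the predictive distribution on the response scale.

```r
# Base-R sketch of the Monte Carlo idea; no inlabru required.
# All objects are hypothetical stand-ins, not output from a fitted model.
set.seed(1)

n_samples <- 2000  # number of posterior draws
N <- 5             # number of prediction locations

# Stand-in posterior draws of the linear predictor WITHOUT site/sample effects
# (rows = posterior draws, columns = prediction locations).
eta <- matrix(rnorm(n_samples * N, mean = 0.5, sd = 0.1), n_samples, N)

# Stand-in posterior draws of the two precision hyperparameters.
site_precision   <- rgamma(n_samples, shape = 50, rate = 25)
sample_precision <- rgamma(n_samples, shape = 50, rate = 5)

# For every posterior draw, simulate a new site effect and a new sample effect
# at each location. rep(..., N) lines the sd values up with the column-major
# layout of the matrix, so row i always uses the precision from draw i.
new_site <- matrix(
  rnorm(n_samples * N, sd = rep(site_precision^-0.5, N)), n_samples, N
)
new_sample <- matrix(
  rnorm(n_samples * N, sd = rep(sample_precision^-0.5, N)), n_samples, N
)

# Predicted mean on the response scale, without and with the new effects.
mean_without <- exp(eta)
mean_with    <- exp(eta + new_site + new_sample)

# Per-location summaries over the posterior draws.
pred_mean     <- colMeans(mean_with)
pred_interval <- apply(mean_with, 2, quantile, probs = c(0.025, 0.975))
```

Comparing colMeans(mean_without) with pred_mean shows the effect of the extra variation: because of the exponential, the added noise changes the mean as well as the spread.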

finnlindgren commented 5 years ago

Note: the covariates also need to be defined in newdata, or defined via functions of location.

finnlindgren commented 5 years ago

pixels generates a SpatialPixelsDataFrame, which doesn't support nrow, but

N <- length(newdata)

seems to be the right thing, and I can confirm that the rnorm approach works. All the names available for use in the formula can be obtained by using

names(generate(model, newdata, n.samples = 1)[[1]])

which should reveal the names of the precision parameters.
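
Once the names are available, the precision parameters can be picked out by pattern matching. The sketch below shows this in base R on a hypothetical stand-in for one generated sample; the element names (for example Precision_for_site) are invented for illustration and will likely differ from what a real fitted model reports, so the pattern may need adjusting.

```r
# Hypothetical stand-in for generate(model, newdata, n.samples = 1)[[1]].
# The element names are invented for illustration only.
state <- list(
  Intercept = 0.3,
  site = c(0.1, -0.2, 0.05),
  sample = c(0.0, 0.3, -0.1),
  Precision_for_site = 2.1,
  Precision_for_sample = 9.7
)

# Keep only the names that mention "precision", ignoring case.
precision_names <- grep("precision", names(state),
                        ignore.case = TRUE, value = TRUE)
print(precision_names)
```

The names found this way are the ones to use in place of site_Precision and sample_Precision in the prediction formula.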