Prediction intervals for mixed effects models

SBuckerfield commented 4 years ago

What methods do people use to generate prediction intervals for mixed effects models? I just used 'predictInterval' from the merTools package to produce 95% prediction intervals and didn't exactly get what I was hoping for.

I have two continuous predictors, one categorical variable, and two random effects that allow random intercepts, in this format:

ME <- lmer(Dependent_Variable ~ Categorical_pred + scale(Continuous_pred1) + scale(Continuous_pred2) + (1 | Random1) + (1 | Random2))

I want to show the prediction interval across one of the continuous predictors for each of the three levels of the categorical variable, while holding the other continuous predictor constant and fixing each random effect at one group whose mean is close to the mean of all the group means for that random effect.

I generated a new set of values from which to make predictions:

NEW <- expand.grid(Continuous_pred2 = 8.01, Continuous_pred1 = seq(from = 0, to = 10, by = 0.01), Random1 = "Group7", Random2 = "Group10", Categorical_pred = c("Cat1", "Cat2", "Cat3"))
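For anyone following along, a quick sanity check on that grid may be useful before it goes into predictInterval. This is only a sketch in base R using the values quoted above; the group and category labels are taken from the post and are not checked against any real data. It shows how many rows the grid has and that expand.grid turns the character values into factors by default.

```r
# Rebuild the prediction grid from the values quoted in the post
NEW <- expand.grid(
  Continuous_pred2 = 8.01,
  Continuous_pred1 = seq(from = 0, to = 10, by = 0.01),
  Random1 = "Group7",
  Random2 = "Group10",
  Categorical_pred = c("Cat1", "Cat2", "Cat3")
)

# 1001 grid points x 3 categories
n_rows <- nrow(NEW)
print(n_rows)  # 3003

# expand.grid uses stringsAsFactors = TRUE by default,
# so the grouping columns arrive as single-level factors
str(NEW)
```

So each of the three categories gets 1001 separate rows, and an interval is computed for every row.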

I then used predictInterval to generate the prediction intervals with the new dataset using the model ME:

pred <- predictInterval(ME, NEW, n.sims = 10000, level = 0.95, stat = 'mean')

When I plot the results against the raw data, the intervals look about right for a prediction interval, except that the lines are very 'shaky' where I was hoping for smooth curves. Increasing n.sims, or using a finer grid for the continuous predictor in the new dataset, makes them smoother, but judging by the examples I've looked at I am already running a lot of simulations. Any advice, or has anyone spotted a simple step I've missed? There seem to be several ways of doing this, as I have discovered on Stack Exchange, but this one seemed the simplest in principle.
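One possible explanation, offered as an assumption rather than a diagnosis of this particular model: simulation-based interval limits are empirical quantiles of random draws, so each point on the grid carries its own Monte Carlo error, and neighbouring points wobble independently. The sketch below does not use merTools or the model above. It is a base R illustration with made-up names (x, sim_upper, wobble) of how the noise in a simulated 97.5% limit shrinks as the number of draws grows.

```r
# Illustration only: base R, no merTools. All names are hypothetical.
set.seed(1)
x <- seq(0, 10, by = 0.5)                    # grid of predictor values
true_upper <- 2 + 0.5 * x + qnorm(0.975)     # exact 97.5% limit when sd = 1

# Empirical 97.5% quantile at each grid point, from n independent draws
sim_upper <- function(n) {
  sapply(x, function(xi) {
    quantile(rnorm(n, mean = 2 + 0.5 * xi, sd = 1), 0.975, names = FALSE)
  })
}

# Root-mean-square deviation of the simulated limit from the exact line
wobble <- function(n) sqrt(mean((sim_upper(n) - true_upper)^2))

w_small <- wobble(1000)
w_large <- wobble(100000)
print(c(n_1e3 = w_small, n_1e5 = w_large))
```

The error in a simulated quantile falls roughly in proportion to 1/sqrt(n), so a line that is ten times smoother needs about a hundred times as many draws. That would be consistent with the lines improving only slowly as n.sims goes up.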

jejoenje commented 4 years ago

Hi, there are a few different ways of calculating prediction intervals for GLMMs. I went through a few in the "Predicting from GLMMs" session last year (repo here). With multiple random effects, it's far from trivial to decide which is best, and it very much depends on what you are trying to do.

It's hard to comment on your specific case without seeing the actual problem (I'm not sure how to interpret "shaky lines"). Could you post a reproducible example?
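Something along these lines would do. This is a minimal sketch that assumes the lme4 and merTools packages are installed and uses the sleepstudy data shipped with lme4 as a stand-in; the model, the subject chosen and the object names are illustrative and are not the original data or model.

```r
library(lme4)
library(merTools)

# sleepstudy ships with lme4; it stands in for the real data here
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Fine grid over the predictor for a single subject
new_data <- expand.grid(Days = seq(0, 9, by = 0.05), Subject = "308")

# Simulation-based 95% interval for every row of the grid
set.seed(42)
pred <- predictInterval(fit, newdata = new_data, n.sims = 1000,
                        level = 0.95, stat = "mean")

# Fitted line (solid) with upper and lower limits (dashed)
plot(new_data$Days, pred$fit, type = "l",
     ylim = range(pred$lwr, pred$upr),
     xlab = "Days", ylab = "Reaction")
lines(new_data$Days, pred$upr, lty = 2)
lines(new_data$Days, pred$lwr, lty = 2)
```

If the dashed lines in a plot like this show the same behaviour, that makes it much easier to say what is going on.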

jejoenje commented 4 years ago

Sorry, I've just realised the link above doesn't have what I thought it did. Bear with me while I find the right presentation.

jejoenje commented 4 years ago

Here you go, this is what I meant: https://github.com/StirlingCodingClub/PredictingGLMMs/blob/master/PredictingGLMMs_md.pdf

StirlingCodingClub / studyGroup

Prediction intervals for mixed effects models #33