Excluding fixed effects in model predictions

matt-w-rees commented 2 years ago

Hi Ken,

I'm predicting estimates from a few models to make some custom plots, but I'm having trouble removing the effect of another explanatory variable. It's categorical, so I can't just set it to its mean etc.

I just want to confirm that there isn't currently a simple way to exclude particular effects (e.g. an argument in the predict function like the one for random effects, or by specifying an NA). If not, can I put in a suggestion for an enhancement 😉? And do you have any recommendations for a workaround?

Cheers, Matt.

kenkellner commented 2 years ago

Hi Matt,

The short answer is no, there isn't a way to completely remove a fixed effect using predict.

I would generally not recommend doing that, as it would create a prediction outside the limits of what is technically possible based on the model. For a random effect, we are saying that the levels observed in the study are only a sample of the possible levels in the population - thus it is valid to assume a new site with a random effect value of 0 so as to "ignore" the random effect in the prediction. For a categorical fixed effect we are assuming that a sample must be one of the specified levels. So we can't create a prediction scenario that somehow does not fall into one of those levels.

I would recommend setting a non-focal categorical covariate to its reference level when generating predictions for plots for other covariates. That's what the built-in plot_effects function does.
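As a rough illustration of what "setting a categorical covariate to its reference level" means numerically, here is a minimal sketch in plain Python (not ubms or R code). It assumes treatment (dummy) coding, a logit link, and entirely made-up coefficient values; the covariate names `elev` and `veg_*` and the level names are hypothetical.

```python
import math

def inv_logit(x):
    """Map a value on the logit scale back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical point estimates on the logit scale (not from any real model).
coefs = {
    "intercept": -0.5,   # applies to the reference level of veg ("forest")
    "elev": 0.8,         # slope for a scaled continuous covariate
    "veg_heath": 0.6,    # offset of "heath" relative to the reference level
    "veg_grass": -0.3,   # offset of "grass" relative to the reference level
}

def design_row(elev, veg, levels=("forest", "heath", "grass")):
    """Build one design-matrix row; the first level is the reference."""
    row = {"intercept": 1.0, "elev": elev}
    for lev in levels[1:]:
        # At the reference level every dummy column is 0.
        row["veg_" + lev] = 1.0 if veg == lev else 0.0
    return row

def predict(elev, veg):
    """Linear predictor for one row, back-transformed to a probability."""
    row = design_row(elev, veg)
    eta = sum(coefs[name] * value for name, value in row.items())
    return inv_logit(eta)

# Effect of the focal covariate with the non-focal one held at its reference.
elev_grid = [-1.0, 0.0, 1.0]
curve = [predict(e, "forest") for e in elev_grid]
```

The resulting curve is a prediction for a scenario that can actually occur (a forest site), which is the point of holding the non-focal covariate at one of its real levels.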

If you really want to do this, one way would be to use extract to get the posterior distribution(s) for the intercepts and slopes in your submodel of interest, remove the parameters you don't want, and calculate the prediction yourself. But again, I wouldn't recommend it.
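The manual calculation described above could look roughly like the following sketch. It is plain Python with simulated draws standing in for whatever extract returns, so the parameter names, the number of draws, and the logit link are all assumptions for illustration and not the ubms API.

```python
import math
import random

def inv_logit(x):
    """Map a value on the logit scale back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

random.seed(1)
n_draws = 1000

# Simulated stand-in for posterior draws of one submodel's coefficients.
# Names and values are hypothetical, not output from a fitted model.
draws = [
    {
        "intercept": random.gauss(-0.5, 0.2),
        "elev": random.gauss(0.8, 0.1),
        "veg_heath": random.gauss(0.6, 0.3),
        "veg_grass": random.gauss(-0.3, 0.3),
    }
    for _ in range(n_draws)
]

# Parameters to keep; the veg_* terms are the ones being "removed".
keep = ("intercept", "elev")

def predict_draws(row, draws, keep):
    """One prediction per posterior draw, using only the kept parameters."""
    out = []
    for d in draws:
        eta = sum(d[name] * row[name] for name in keep)
        out.append(inv_logit(eta))
    return out

def summarise(values, lower=0.025, upper=0.975):
    """Posterior mean and a simple (unsmoothed) quantile interval."""
    s = sorted(values)
    n = len(s)
    return {
        "mean": sum(s) / n,
        "lower": s[int(lower * (n - 1))],
        "upper": s[int(upper * (n - 1))],
    }

row = {"intercept": 1.0, "elev": 0.5}
summary = summarise(predict_draws(row, draws, keep))
```

Note that under treatment coding, dropping the dummy coefficients gives exactly the same numbers as predicting at the reference level, so in that case the "removed" prediction is just the reference-level prediction under another name.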

If you know of another modeling package that allows this, I'd be willing to reconsider.

Ken

matt-w-rees commented 2 years ago

Thanks so much Ken - that is a very helpful explanation.

Reflecting on this, my categorical variable should really be treated as a random effect (this is actually the vegetation type variable I was referring to when I was asking about nested random effects in the other issue). I tried fitting it as a separate RE, and this seriously increased computation time relative to treating it as a fixed effect, but I should really just suck it up.

I know the 'mgcv' predict function allows exclusion of fixed effects, but I definitely see your point.

Thanks again!

biodiverse / ubms

Excluding fixed effects in model predictions #56