melff / mclogit

mclogit: Multinomial Logit Models, with or without Random Effects or Overdispersion
http://melff.github.io/mclogit/

mblogit model comparison with anova does not calculate degrees of freedom from random parameters #19

Closed vjilmari closed 2 years ago

vjilmari commented 3 years ago

Thanks for the great package!

I ran into a minor problem when comparing models. I fitted two models, one with random intercepts and the other with random intercepts and slopes. Model comparison with anova() calculated the deviance difference correctly, but the degrees of freedom seem to be wrong: the calculation appears to use only the fixed part of the formula. Below is an example:


library(mclogit)

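set.seed(21385)
n <- 2000

# simulated data: 3-category outcome, effect-coded gender, 10 countries
d <- data.frame(
  DV = as.factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  gender = sample(c(-0.5, 0.5), n, replace = TRUE),
  cntry = sample(letters[1:10], n, replace = TRUE))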

# with random intercept
fit.1 <- mblogit(DV ~ gender, random = ~1 | cntry, data = d)
summary(fit.1)

# with random intercept and random slope
fit.2 <- mblogit(DV ~ gender, random = ~1 + gender | cntry, data = d)
summary(fit.2)

# anova reports the same number of degrees of freedom for both models
anova(fit.2, fit.1, test = "Chisq")

melff commented 3 years ago

You are right, the degrees of freedom reported are based only on the variables in the fixed part of the model. But I think it is not easy to determine the correct degrees of freedom for comparisons of models with different random structures. The literature I am aware of does not yield any guidelines in this respect. I have not thought about such tests yet, because mblogit() uses a rather crude Laplace approximation, so I would not trust likelihood ratio tests anyway. What would you expect to be the "correct" degrees of freedom for the test, and why?

vjilmari commented 3 years ago

Thanks for the response!

You are absolutely correct. I forgot about the Laplace approximation. I (and perhaps many others) am used to running LRTs between models in lmer or glmer. These are not necessarily trustworthy for random effects either, but at least for lmer there are parametric bootstrap methods like PBmodcomp to help with that. People with that habit are unfortunately going to look for the same solution here without thinking too deeply about it.
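For readers unfamiliar with the parametric bootstrap mentioned above, here is a minimal base-R sketch of the general idea. It uses plain glm() fits as a stand-in, not mblogit() or PBmodcomp, and the data, the models, and the number of replicates are all made up for illustration. Whether the same approach is workable for mblogit() models is not something this thread settles.

```r
# Parametric bootstrap of a likelihood ratio statistic (illustration only).
# Stand-in models: two nested logistic regressions fitted with glm().
set.seed(1)
n <- 200
d <- data.frame(x = rnorm(n))
d$y <- rbinom(n, 1, plogis(0.2))  # made-up data; x has no true effect

fit0 <- glm(y ~ 1, family = binomial, data = d)  # null model
fit1 <- glm(y ~ x, family = binomial, data = d)  # larger model
obs <- deviance(fit0) - deviance(fit1)           # observed LR statistic

n_boot <- 200  # arbitrary; real applications usually use more
boot <- replicate(n_boot, {
  d_star <- d
  d_star$y <- simulate(fit0)[[1]]  # draw a new response from the null fit
  m0 <- glm(y ~ 1, family = binomial, data = d_star)
  m1 <- glm(y ~ x, family = binomial, data = d_star)
  deviance(m0) - deviance(m1)      # LR statistic under the null
})

# Share of simulated statistics at least as large as the observed one
p_boot <- (1 + sum(boot >= obs)) / (n_boot + 1)
p_boot
```

The reference distribution comes from refitting both models to data simulated under the null model, so no degrees of freedom need to be chosen.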

Is there another way to compare different random effect structures? Is there a difference between MQL and PQL in such comparisons?

Should there be a warning in mclogit's anova() when the models differ only in their random effects, to discourage people from manually calculating an LRT based on their own idea of the degrees of freedom? That idea would probably come directly from using lmer (just counting how many more parameters there are in the random effect covariance matrix).
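To make the "counting covariance parameters" idea above concrete, here is a small base-R sketch of the naive calculation. The helper names are hypothetical and not part of mclogit, the deviance difference is a placeholder rather than output from the models above, and the n_eq argument rests on the assumption that a baseline-category logit model with K categories has K - 1 equations, each with its own random effects. As discussed above, a p-value obtained this way should not be trusted.

```r
# Free parameters in an unstructured q x q covariance matrix
n_cov_params <- function(q) q * (q + 1) / 2

# Hypothetical helper: naive difference in the number of covariance
# parameters between two random-effects structures.
#   q_small, q_large: random-effect terms per group in each model
#   n_eq: number of logit equations (assumption: categories - 1)
naive_df_diff <- function(q_small, q_large, n_eq = 1) {
  n_cov_params(q_large * n_eq) - n_cov_params(q_small * n_eq)
}

# lmer-style count: random intercept vs. random intercept + slope
df_lmer_style <- naive_df_diff(1, 2)        # 3 - 1 = 2

# Same comparison assuming two logit equations (three categories)
df_two_eq <- naive_df_diff(1, 2, n_eq = 2)  # 10 - 3 = 7

# Naive chi-squared p-value for a placeholder deviance difference
dev_diff <- 5.0  # illustrative value only
p_naive <- pchisq(dev_diff, df = df_lmer_style, lower.tail = FALSE)
```

The two counts differ a lot (2 versus 7), which is one way to see why the "correct" degrees of freedom are not obvious here.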

melff commented 2 years ago

As of release 0.9.4, using anova() on models with random effects issues a warning.