JenniNiku / gllvm

Generalized Linear Latent Variable Models
https://jenniniku.github.io/gllvm/

AIC model selection vs. Likelihood Ratio Tests #106

Closed RodolfoPelinson closed 1 year ago

RodolfoPelinson commented 1 year ago

Hello. First of all, thank you for this great package that has been helping me a lot with the analysis of community data.

I am currently analyzing community data from an experiment with a fully factorial design of two factors with three levels each (agrochemical treatment and spatial isolation). I wish to test whether the two treatments have additive or interactive effects on the community data. Something like: community ~ treatments * isolation

So I first ran the following models and compared them using AICc (I am omitting some arguments just to keep the code cleaner). Also, I am using zero latent variables because the AIC value for this model was by far the smallest compared to 1, 2, 3, 4, and 5 latent variables (all tested with the full model).

model_1 <- gllvm(formula = ~ 1, num.lv = 0, n.init = 10, seed = 1:10)
model_2 <- gllvm(formula = ~ treatments, num.lv = 0, n.init = 10, seed = 1:10)
model_3 <- gllvm(formula = ~ isolation, num.lv = 0, n.init = 10, seed = 1:10)
model_4 <- gllvm(formula = ~ treatments + isolation, num.lv = 0, n.init = 10, seed = 1:10)
model_5 <- gllvm(formula = ~ treatments * isolation, num.lv = 0, n.init = 10, seed = 1:10)

When I compare those models using AICc (using the function AICc() from the gllvm package), this is the result:

model_1 = 1783.451
model_2 = 1777.148
model_3 = 1778.979
model_4 = 1776.965
model_5 = 1804.377

In a cleaner model selection table (this is from a function that I designed myself):

                   model     AICc delta_AICc df nobs
1              No Effect 1783.451  6.4863018 14  315
2             Treatments 1777.148  0.1829908 28  315
3              Isolation 1778.979  2.0145949 28  315
4 Isolation + Treatments 1776.965  0.0000000 42  315
5 Isolation * Treatments 1804.377 27.4121353 70  315
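
For reference, a minimal sketch of how such a table can be built (not my exact function; it assumes the fits are collected in a named list and that logLik() on a gllvm fit carries the usual "df" and "nobs" attributes):

aicc_table <- function(fits) {
  aicc <- sapply(fits, AICc)  # AICc() from the gllvm package
  data.frame(model      = names(fits),
             AICc       = aicc,
             delta_AICc = aicc - min(aicc),
             df         = sapply(fits, function(m) attr(logLik(m), "df")),
             nobs       = sapply(fits, function(m) attr(logLik(m), "nobs")))
}

aicc_table(list("No Effect"              = model_1,
                "Treatments"             = model_2,
                "Isolation"              = model_3,
                "Isolation + Treatments" = model_4,
                "Isolation * Treatments" = model_5))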

It seems that the most plausible model is the one that includes only additive effects of both factors. Considering the rule of thumb that models with delta AICc < 2 are equally plausible, I would say that the most plausible model is actually the one that includes only the effects of treatments, since it has delta AICc < 2 and is a simpler model. Also, the model that includes the interaction is by far the least plausible one.

However, if I do likelihood ratio tests using the anova() function from the gllvm package, these are the results:

Testing the effects of treatments:

> anova(model_3, model_4)
Model  1 :  ~ treatments 
Model  2 :  ~ treatments + isolation 
  Resid.Df        D Df.diff    P.value
1      287  0.00000       0           
2      273 35.78408      14 0.00112398

Testing the effects of isolation:

> anova(model_2, model_4)
Model  1 :  ~ isolation 
Model  2 :  ~ treatments + isolation 
  Resid.Df        D Df.diff    P.value
1      287  0.00000       0           
2      273 37.61569      14 0.00059483

Testing the effects of the interaction:

> anova(model_4, model_5)
Model  1 :  ~ treatments + isolation 
Model  2 :  ~ treatments * isolation 
  Resid.Df        D Df.diff   P.value
1      273  0.00000       0          
2      245 56.04616      28 0.0012698

It basically says that all main effects and the interaction are significant, with very small p-values.
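
For what it is worth, these p-values appear to be consistent with the usual chi-square reference for the deviance difference; for the interaction test, for example:

pchisq(56.04616, df = 28, lower.tail = FALSE)  # ~0.00127, the P.value reported above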

I understand that model selection using AIC is not supposed to return exactly the same results (qualitatively) as likelihood ratio tests. I am also aware that the anova() function only returns approximate p-values when df.diff is larger than 20. Still, these results are strongly discrepant. I usually don't get such conflicting results between AIC and LRT in univariate models. I wonder whether I am doing something wrong. If not, do you know what is probably happening? Any guesses on which approach I should use?

Thank you!

BertvanderVeen commented 1 year ago

Thanks for your question; there is no reason why anova and AICc should correspond here (though ideally they would of course).

In general I would suggest caution in doing extensive model selection with either LRT or AIC in models with mixed effects (or in models generally, actually). The LRT does not penalize for the number of parameters, while AICc does, and a model is likely to improve if you throw in a number of parameters equal to the number of species! As the anova function warns, please do not rely on it when the difference in the number of parameters is so large (as is the case here).
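
To make that concrete: assuming AICc() applies the standard small-sample correction, AICc = -2 log L + 2k + 2k(k+1)/(n - k - 1), your two sets of results actually reconcile exactly (k = number of parameters, n = 315 from your table):

aicc_penalty <- function(k, n) 2 * k + 2 * k * (k + 1) / (n - k - 1)

## extra AICc penalty for the 28 interaction parameters (k goes from 42 to 70):
aicc_penalty(70, 315) - aicc_penalty(42, 315)  # = 83.46

## deviance gained by those parameters, from anova(model_4, model_5): 56.05.
## The penalty outweighs the gain by 83.46 - 56.05 = 27.41, which is exactly
## the delta_AICc of the interaction model in your table.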

I am very hesitant to make a recommendation about what you should do, as there are many different opinions and thoughts on the subject of model selection (and how to do it in mixed-effects models). I would personally go with AICc here, or "just" rely on the model where all terms are included and compare the statistical uncertainties of the parameter estimates, avoiding model selection altogether.
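
For the latter option, something along these lines (a sketch; it assumes standard errors were computed when fitting, i.e. sd.errors = TRUE, the default):

summary(model_5)    # estimates with standard errors and Wald statistics
confint(model_5)    # confidence intervals for the parameter estimates
coefplot(model_5)   # plot covariate coefficients per species with their CIs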

As an aside, the seed argument is ignored when you specify n.init, as the seeds are randomly sampled internally.