JuliaStats / GLM.jl

Generalized linear models in Julia

lrtest gives incorrect results #490

Closed by jerlich 1 year ago

jerlich commented 2 years ago

I created a gist to show the issue and provide a MWE https://gist.github.com/jerlich/9620fcf758a3557f580981e61f357e38

In a nutshell, lrtest gives results that are inconsistent both with ftest in GLM.jl and with the results from R.

ftest(m1r.model, m1.model)
F-test: 2 models fitted on 29 observations
───────────────────────────────────────────────────────────────
     DOF  ΔDOF     SSR     ΔSSR      R²     ΔR²      F*   p(>F)
───────────────────────────────────────────────────────────────
[1]    3        0.1452           0.8186                        
[2]    5     2  0.1101  -0.0351  0.8624  0.0438  3.9818  0.0315
───────────────────────────────────────────────────────────────

lrtest(m1r.model, m1.model)
Likelihood-ratio test: 2 models fitted on 29 observations
──────────────────────────────────────────────
     DOF  ΔDOF  Deviance  ΔDeviance  p(>Chisq)
──────────────────────────────────────────────
[1]    3          0.1452                      
[2]    5     2    0.1101    -0.0351     0.9826
──────────────────────────────────────────────
jerlich commented 1 year ago

I did some digging, and it seems that lrtest is incorrectly using the difference in deviance between nested models, when it should be using the difference in log-likelihood between nested models.

When I run

using StatsFuns  # provides chisqccdf

dLL = diff(loglikelihood.([l1, l2]))  # l1, l2 are the nested and full models
chisqccdf(2, 2 * dLL[1])              # p-value for the LR test with 2 degrees of freedom

I get the same p-value as I observe in R using lmtest::lrtest.
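For concreteness, here is a sketch of the arithmetic in Python (illustrative only, not GLM.jl code), using n and the SSR values from the tables above. For a Gaussian model with the variance profiled out, the log-likelihood is -n/2 * (log(2π·RSS/n) + 1), so 2·ΔlogL reduces to n·log(RSS₁/RSS₂):

```python
# Illustrative Python sketch of the two computations (not GLM.jl code).
# The RSS values and n are taken from the ftest/lrtest tables above.
from math import log
from scipy.stats import chi2

n = 29            # observations
rss1 = 0.1452     # restricted model (3 DOF)
rss2 = 0.1101     # full model (5 DOF)
ddof = 2          # difference in degrees of freedom

# What lrtest effectively did: use the raw deviance difference as the statistic.
p_wrong = chi2.sf(rss1 - rss2, ddof)

# Correct LR statistic: 2 * (logL2 - logL1), which for a Gaussian model with
# the variance profiled out simplifies to n * log(RSS1 / RSS2).
lr_stat = n * log(rss1 / rss2)
p_correct = chi2.sf(lr_stat, ddof)

print(p_wrong)    # ~0.9826, matching the lrtest output above
print(p_correct)  # ~0.018, a significant result, unlike lrtest's 0.9826
```

The first value reproduces the 0.9826 in the lrtest table, confirming that the deviance difference was plugged directly into the χ² tail probability.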

nalimilan commented 1 year ago

Good catch. Taking the deviance or the log-likelihood gives the same result for model families where deviance equals minus twice the log-likelihood plus a constant term, but that's not the case for families with a dispersion parameter such as the normal. See https://github.com/JuliaStats/StatsModels.jl/pull/261.
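As a numerical sanity check of this point, here is a small Python sketch (made-up data, not GLM.jl code) showing that for a dispersion-free family such as the Poisson, the deviance difference between nested fits equals twice the log-likelihood difference exactly, which is why the deviance shortcut is harmless there:

```python
# Hypothetical check with made-up counts: for the Poisson family,
# deviance = 2 * (logL_saturated - logL_model), so the deviance difference
# between nested models equals 2 * dLL exactly.
import numpy as np
from scipy.special import gammaln

y = np.array([3.0, 5, 2, 8, 9, 7, 6, 10])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])

def poisson_loglik(y, mu):
    return np.sum(y * np.log(mu) - mu - gammaln(y + 1))

def poisson_deviance(y, mu):
    return 2 * np.sum(y * np.log(y / mu) - (y - mu))

# Closed-form MLE fitted means: overall mean (null) vs per-group means (full).
mu_null = np.full_like(y, y.mean())
mu_full = np.where(group == 0, y[group == 0].mean(), y[group == 1].mean())

d_dev = poisson_deviance(y, mu_null) - poisson_deviance(y, mu_full)
two_dll = 2 * (poisson_loglik(y, mu_full) - poisson_loglik(y, mu_null))
print(np.isclose(d_dev, two_dll))  # True: identical up to rounding
```

For the normal family the analogous identity fails, because the log-likelihood depends on the estimated dispersion while the deviance (the RSS) does not.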

nalimilan commented 1 year ago

Regarding the "second strange thing", it's expected that fitting a GLM with a normal distribution gives different p-values for coefficients than fitting an LM, due to the z-stat vs t-stat difference. R has chosen to always compute the t-statistic, but GLM.jl prefers to use the same test for all GLMs. Likewise, there's no reason to expect ftest and lrtest to give the same result.
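The z- vs t-statistic difference can be sketched in Python (the statistic and degrees of freedom below are made-up numbers, chosen near a sample size of 29): the same coefficient statistic yields different p-values depending on the reference distribution, and the gap is noticeable at small n.

```python
# Illustrative sketch (assumed numbers, not GLM.jl or R output): the same
# estimate/SE ratio tested against the normal vs the t distribution.
from scipy.stats import norm, t

stat = 2.1   # hypothetical coefficient estimate / standard error
dof = 24     # hypothetical residual degrees of freedom (n - p)

p_z = 2 * norm.sf(abs(stat))    # z-test, as GLM.jl uses for all GLMs
p_t = 2 * t.sf(abs(stat), dof)  # t-test, as R's lm / gaussian glm reports

print(round(p_z, 4))
print(round(p_t, 4))  # larger than p_z: the t distribution has heavier tails
```

As dof grows the two p-values converge, which is why the discrepancy only stands out in small samples like this one.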