jerlich closed this issue 1 year ago
I did some digging, and it seems that `lrtest` is incorrectly using the difference in deviance between nested models, when it should be using the difference in log-likelihood.
When I run

```julia
using StatsFuns  # for chisqccdf; l1 and l2 are the nested fitted models
dLL = diff(loglikelihood.([l1, l2]))
chisqccdf(2, 2 * dLL[1])
```
I get the same p-value as I observe in R using `lmtest::lrtest`.
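For readers without a Julia setup, the same likelihood-ratio computation can be sketched in Python. This is a hedged illustration with simulated data (the variable names and the data-generating process are made up, not from the issue): fit two nested OLS models, take twice the difference in their Gaussian profile log-likelihoods, and compare it against a chi-square tail probability, which is what the Julia snippet above does.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
n = 100
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(size=n)  # x2 has no true effect

def gaussian_loglik(y, X):
    """Profile log-likelihood of an OLS fit (sigma^2 fixed at its MLE, RSS/n)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    m = len(y)
    return -0.5 * m * (np.log(2 * np.pi * rss / m) + 1)

X_reduced = np.column_stack([np.ones(n), x1])        # nested model
X_full = np.column_stack([np.ones(n), x1, x2])       # full model
dLL = gaussian_loglik(y, X_full) - gaussian_loglik(y, X_reduced)
p = chi2.sf(2 * dLL, df=1)  # df = difference in number of parameters
print(p)
```

Because the full model nests the reduced one, `dLL` is always non-negative, and `2 * dLL` is the likelihood-ratio statistic that `lmtest::lrtest` reports.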
Good catch. Taking the deviance or the log-likelihood gives the same result for model families where deviance equals minus twice the log-likelihood plus a constant term, but that's not the case for families with a dispersion parameter such as the normal. See https://github.com/JuliaStats/StatsModels.jl/pull/261.
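To make the distinction concrete, here is a small numeric sketch with hypothetical RSS values (the numbers are invented for illustration): for a Gaussian model the deviance is the residual sum of squares, while the profile log-likelihood carries an extra log(RSS) term, so the difference in deviance and twice the difference in log-likelihood are genuinely different statistics.

```python
import numpy as np

n = 50
rss_reduced, rss_full = 120.0, 100.0  # hypothetical RSS of two nested fits

# What a deviance-based test would use: Gaussian deviance is just the RSS.
dev_stat = rss_reduced - rss_full

# 2 * (LL_full - LL_reduced) with sigma^2 profiled out at RSS/n:
# the constant terms cancel, leaving n * log(RSS_reduced / RSS_full).
lr_stat = n * np.log(rss_reduced / rss_full)

print(dev_stat, lr_stat)  # different statistics -> different p-values
```

For families like Poisson or binomial (dispersion fixed at 1) the two statistics coincide up to a constant, which is why the bug only shows up for families with a free dispersion parameter.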
Regarding the "second strange thing": it's expected that fitting a GLM with a normal distribution gives different coefficient p-values than fitting a LM, due to the z-statistic vs t-statistic difference. R has chosen to always compute the t-statistic, but GLM.jl prefers to use the same test for all GLMs. Likewise, there's no reason to expect `ftest` and `lrtest` to give the same result.
I created a gist with a MWE showing the issue: https://gist.github.com/jerlich/9620fcf758a3557f580981e61f357e38
In a nutshell, `lrtest` gives results inconsistent with `ftest` in GLM.jl and with the results from R.