Closed by nalimilan 2 years ago
Merging #477 (c34c21a) into master (42a0d04) will not change coverage. The diff coverage is n/a.
@@           Coverage Diff           @@
##           master     #477   +/-   ##
=======================================
  Coverage   85.12%   85.12%
=======================================
  Files           7        7
  Lines         827      827
=======================================
  Hits          704      704
  Misses        123      123
@nalimilan The null model is "just" `y ~ 1`, so it would be trivial to create a null model using the response of the original and a column of 1's as `X`.
Or am I missing something?
Yeah, I just wonder whether we should compute this more efficiently. After looking into it, I have a commit here that computes the null deviance and log-likelihood correctly simply by defining the null model as taking the predicted response to be the mean response. This works for all models we currently test, except those with offsets. Do you think there exists a direct formula for this case too?
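To illustrate why the mean-response shortcut can replace an actual fit, here is a minimal Python sketch for the Poisson family with a log link and no offset. It is not GLM.jl code: the function names and the toy data are assumptions made up for this example. It fits the intercept-only model by Newton's method and compares the result with simply plugging in the mean of `y`.

```python
import math

def poisson_deviance(y, mu):
    """Poisson deviance, using the convention y*log(y/mu) = 0 when y == 0."""
    total = 0.0
    for yi, m in zip(y, mu):
        term = yi * math.log(yi / m) if yi > 0 else 0.0
        total += term - (yi - m)
    return 2.0 * total

def fit_intercept_only(y, max_iter=1000, tol=1e-12):
    """Fit log(mu) = b0 by Newton's method and return the fitted mean exp(b0)."""
    n = len(y)
    total = sum(y)
    b0 = 0.0
    for _ in range(max_iter):
        mu = math.exp(b0)
        score = total - n * mu   # derivative of the log-likelihood in b0
        hessian = -n * mu        # second derivative in b0
        step = score / hessian
        b0 -= step
        if abs(step) < tol:
            break
    return math.exp(b0)

# Hypothetical toy counts, only for illustration.
y = [2, 0, 3, 1, 4]

mu_fit = fit_intercept_only(y)   # null model obtained by actually fitting
mu_mean = sum(y) / len(y)        # null model obtained from the mean response

dev_fit = poisson_deviance(y, [mu_fit] * len(y))
dev_mean = poisson_deviance(y, [mu_mean] * len(y))
print(mu_fit, mu_mean, dev_fit, dev_mean)
```

Both routes should agree up to numerical tolerance here, because the score equation of an intercept-only model without offset is solved exactly by the sample mean.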
@nalimilan did you push the commit? For Poisson at least I think there might be a few easy/fast cases even with offset. I'll have to work through the algebra later to be sure there.
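One case that does seem to have a closed form is Poisson with a log link and an offset `o_i`: the null model is `mu_i = exp(b0 + o_i)`, and the score equation `sum(y_i - mu_i) = 0` gives `b0 = log(sum(y) / sum(exp(o)))`. The Python sketch below checks this numerically. It is only an illustration under those assumptions (the function name and the data are hypothetical), and it says nothing about other links or families, which may still need an iterative fit.

```python
import math

def poisson_null_means_with_offset(y, offset):
    """Null-model means for a Poisson/log-link model with an offset.

    Assumes mu_i = exp(b0 + offset_i); b0 solves sum(y_i - mu_i) = 0 in closed form.
    """
    b0 = math.log(sum(y) / sum(math.exp(o) for o in offset))
    return [math.exp(b0 + o) for o in offset]

# Hypothetical counts and exposures; the offset is log(exposure).
y = [3, 1, 4, 2]
exposure = [10.0, 5.0, 20.0, 8.0]
offset = [math.log(e) for e in exposure]

mu0 = poisson_null_means_with_offset(y, offset)

# At the maximum-likelihood intercept the score is zero.
score = sum(yi - mi for yi, mi in zip(y, mu0))
print(mu0, score)
```

In this parametrization the fitted means are just each exposure times the pooled rate `sum(y) / sum(exposure)`.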
I've just filed https://github.com/JuliaStats/GLM.jl/pull/479. Let me know if you can find the formulas. Though if some cases have no closed-form solution we'll have to rely on fitting the null model at least for them...
FWIW, R's `?glm` has this warning about offsets:
null.deviance: The deviance for the null model, comparable with ‘deviance’. The null model will include the offset, and an intercept if there is one in the model. Note that this will be incorrect if the link function depends on the data other than through the fitted mean: specify a zero offset to force a correct calculation.
Does `r^2` have to return a pseudo-r2? I ask because there are many definitions of pseudo-r2, so it can be confusing. This could also cause mistakes for users expecting the function to return only the actual `r^2`. (I would assume this myself.)
`r2` only returns the pseudo-R² if you pass a second argument specifying which variant you want.
Superseded by https://github.com/JuliaStats/GLM.jl/pull/479.
`r2` does not work for GLMs currently (even with two arguments) as they don't implement `nullloglikelihood`. Also add a mention about `adjr2`.
Fixes https://github.com/JuliaStats/GLM.jl/issues/475.
If somebody has an idea regarding how `nullloglikelihood` could be implemented... Maybe all we need to do is define for each link what the prediction would be for all observations under the null model, and call `loglik_obs` on that like we do for `loglikelihood`?
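The idea could be sketched roughly as follows, in Python rather than Julia and with hypothetical helpers: the `loglik_obs_*` functions here are stand-ins written for this example, not GLM.jl's `loglik_obs`, whose actual signature is not reproduced. The sketch assumes no offset and no weights, so the null prediction is the mean response for every observation.

```python
import math

def loglik_obs_bernoulli(y, mu):
    """Log-likelihood of one 0/1 observation with success probability mu."""
    return math.log(mu) if y == 1 else math.log(1.0 - mu)

def loglik_obs_poisson(y, mu):
    """Log-likelihood of one count observation with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

def null_loglikelihood(y, loglik_obs):
    """Sum per-observation log-likelihoods at the null prediction mean(y).

    Assumes an intercept-only null model without offset or weights.
    """
    mu0 = sum(y) / len(y)
    return sum(loglik_obs(yi, mu0) for yi in y)

# Hypothetical data, only for illustration.
ll_bernoulli = null_loglikelihood([1, 0, 1, 1], loglik_obs_bernoulli)
ll_poisson = null_loglikelihood([2, 0, 3, 1, 4], loglik_obs_poisson)
print(ll_bernoulli, ll_poisson)
```

Families with a dispersion parameter would also need that parameter estimated under the null model, which this sketch leaves out.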