TaddyLab / gamlr

Gamma lasso regression

Is the corrected AIC correct? #17

Open pjgorski opened 10 months ago

pjgorski commented 10 months ago

The current implementation of the corrected AIC (AICc) has a correction term that does not vanish with increasing $n$. The documentation says the formula follows Hurvich and Tsai, 1989.

In the paper, the formula is as follows: image

So the term $kdn/(n-d-1)$, which is equal to $\dfrac{2(n-m)n}{m-1}$, is not equal to the term from the paper.

mataddy commented 10 months ago

yep it's correct. d = n-m in their special case. read up on the degrees of freedom section.

pjgorski commented 10 months ago

You're right. Your AICc is correct.

However, what you wrote, $d=n-m$, is incorrect for your implementation. $n-m$ is the degrees of freedom for LR models. $d$ in your implementation is the degrees of freedom of the logLik function, which is something different (I think it is the number of parameters that would be estimated when doing MLE). That is why the documentation of AICc was misleading to me when I read it.
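
For anyone else reading, here is a short check of why the term is fine. It assumes the penalty is $kdn/(n-d-1)$ with $k=2$ and that $d$ is the number of estimated parameters, as described above. Starting from the usual form of AICc (AIC penalty $2d$ plus the small-sample correction):

$$
2d + \frac{2d(d+1)}{n-d-1} = \frac{2d(n-d-1) + 2d(d+1)}{n-d-1} = \frac{2dn}{n-d-1}
$$

So $2dn/(n-d-1)$ is the whole penalty, not only the correction. With $d$ fixed it tends to $2d$ as $n$ grows, and the correction part, $2d(d+1)/(n-d-1)$, does vanish.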

(I didn't understand what you meant by "read up on the degrees of freedom section". I didn't see such a section in the mentioned paper.)

As a side note, I will point out that the AICc implementation in the qpcR library is incorrect. Their function follows the formula from Wikipedia, but they use only the number of coefficients from LR and do not count the variance estimate as a parameter.
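
To illustrate how much the parameter count matters, here is a minimal Python sketch. It is not code from gamlr or qpcR: the function names `gaussian_loglik` and `aicc` and all the numbers are made up for illustration, and it assumes a Gaussian linear model where the coefficient count already includes the intercept.

```python
import math


def gaussian_loglik(rss, n_obs):
    """Maximized Gaussian log-likelihood of a linear model.

    rss is the residual sum of squares; the error variance is
    replaced by its MLE, rss / n_obs.
    """
    sigma2_hat = rss / n_obs
    return -0.5 * n_obs * (math.log(2.0 * math.pi * sigma2_hat) + 1.0)


def aicc(loglik, n_params, n_obs):
    """AICc written as -2*logLik + 2*d*n / (n - d - 1).

    This equals AIC + 2*d*(d + 1) / (n - d - 1), with d = n_params.
    """
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc needs n_obs > n_params + 1")
    return -2.0 * loglik + 2.0 * n_params * n_obs / (n_obs - n_params - 1)


# Illustrative values only (hypothetical data, not from the thread).
n_obs, n_coef, rss = 30, 4, 12.5
ll = gaussian_loglik(rss, n_obs)

# d counts only the regression coefficients.
aicc_coef_only = aicc(ll, n_coef, n_obs)
# d also counts the estimated error variance.
aicc_with_var = aicc(ll, n_coef + 1, n_obs)

print("coefficients only:      ", aicc_coef_only)
print("coefficients + variance:", aicc_with_var)
```

The log-likelihood is the same in both calls, so the gap between the two values comes only from the choice of $d$. For small $n$ that gap can be large enough to change which model is selected.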