Open BenjaminPlanterose opened 2 years ago
Hello Benjamin,
thanks a lot for your detailed report! Extremely nice issue documentation.
I haven't been able to dig deeper yet, as I was quite busy lately. Just wanted to mention, that I have not forgotten this issue and it is on my to-do list :)
Dear Steffen,
I believe there are some errors in the expression employed to compute the standard errors of the ridge linear coefficients (ridge:::vcov.ridgeLinear). Specifically, it always penalizes the intercept (even when argument scaling = "none" on the ridge::linearRidge call) and it uses an incomplete formula. The proper formula for the Ridge standard errors is:
Where Gamma is the unitary matrix of dimensions (m+1)×(m+1) but where element i=1, j=1, is set to zero to avoid penalizing the intercept (only in the case of scaling = "none"). And it is not as programmed in ridge:::vcov.ridgeLinear:
You may find references for this formula below: https://www.statlect.com/fundamentals-of-statistics/ridge-regression https://ncss-wpengine.netdna-ssl.com/wp-content/themes/ncss/pdf/Procedures/NCSS/Ridge_Regression.pdf https://stats.stackexchange.com/questions/2121/how-can-i-estimate-coefficient-standard-errors-when-using-ridge-regression
See also below a script reproducing what I believe the proper Ridge standard errors are:
I am aware that due to the lack of scale invariance of the ridge estimator, that center-scaling is recommended (and hence dropping the intercept column). Also, the degrees of freedom in ridge regression are not really n-(m+1) (some authors speak of effective degrees of freedom as n minus the trace of the hat matrix).
FYI, I am using ridge_3.0 on R version 4.1.0 (2021-05-18), platform: x86_64-w64-mingw32/x64 (64-bit), running under: Windows >= 8 x64 (build 9200).
Please let me know if I have overseen anything. Have a nice day,
Best, Ben.