pbreheny / grpreg

Regularization paths for regression models with grouped covariates
http://pbreheny.github.io/grpreg/

Incorrect calculation of `loss` variable in a model returned by `grpreg(...)` #53

Closed. SzymonNowakowski closed this issue 3 months ago

SzymonNowakowski commented 2 years ago

A model `fit` returned by `grpreg(...)` has a `fit$loss` component. For the Gaussian family, `fit$loss` is equal to the RSS, but according to the Gaussian (linear regression) section of the Models vignette it should be equal to 1/2 of the RSS, i.e. the negative log-likelihood. See the reproducible example below:

> library(glmnet)
> library(grpreg)
> data(QuickStartExample)
> x <- QuickStartExample$x
> y <- QuickStartExample$y

> X <- stats::model.matrix(y ~ ., data = data.frame(y = y, x, check.names = TRUE))[, -1, drop = FALSE]
> fit <- grpreg(X, y, group = 1:20, penalty = "grLasso")
> sum((cbind(1, x) %*% fit$beta[, 10] - y)^2)   # plain RSS, with no factor of 1/2
[1] 363.5578
> fit$loss[10]
[1] 363.5578

According to the vignette, the value should instead be half of the RSS, i.e. 363.5578 / 2 = 181.7789.
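To make the link between "1/2 of RSS" and the negative log-likelihood concrete, here is a minimal base-R sketch. It does not use grpreg at all: the simulated data, the `lm()` fit, and every variable name are illustrative assumptions, and the error standard deviation is fixed at 1. Under those assumptions, the Gaussian negative log-likelihood equals RSS/2 plus a constant that does not depend on the coefficients.

```r
# Simulated data; all names here are illustrative, not taken from grpreg.
set.seed(1)
n <- 100
x <- matrix(rnorm(n * 3), n, 3)
y <- drop(1 + x %*% c(2, -1, 0.5) + rnorm(n))

# Ordinary least-squares fit, used only to obtain fitted values and residuals.
fit_lm <- lm(y ~ x)
res <- residuals(fit_lm)

rss <- sum(res^2)      # residual sum of squares
half_rss <- rss / 2    # the quantity the vignette's definition would give

# Gaussian negative log-likelihood with sigma fixed at 1 (an assumption).
nll <- -sum(dnorm(y, mean = fitted(fit_lm), sd = 1, log = TRUE))

# The two differ only by the constant (n / 2) * log(2 * pi).
all.equal(nll, half_rss + (n / 2) * log(2 * pi))
```

So whether a package reports RSS or RSS/2 does not change which coefficients minimize the loss, but it does change the reported number by a factor of 2, which matters when the value is compared against a documented definition.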

SzymonNowakowski commented 2 years ago

On the other hand, the grpreg instructions, at the bottom of page 15, clearly state that `loss` is equal to the RSS for the gaussian family.

I leave this issue for your consideration as a possible inconsistency between the different sources of documentation for your package.