I believe I have found a possible bug in the group coefficient constraint for penalty="grLasso" in grpreg. The penalty vignette states that "the coefficients within a group will either all equal zero or none will equal zero". This constraint appears to be violated in a data example I came across.
To reproduce this potential bug, please execute:
#load data example from github
library(Rfssa)
load_github_data("https://github.com/SzymonNowakowski/DMRnet/blob/testing_branch/data/promoX.RData")
load_github_data("https://github.com/SzymonNowakowski/DMRnet/blob/testing_branch/data/promoy.RData")
#prepare data in grpreg-accepted format
y <- ifelse(y == levels(y)[2], 1, 0)
X <- stats::model.matrix(y~., data = data.frame(y=y, X, check.names = TRUE))[, -1, drop=FALSE]
group <- rep(1:57, each = 3)
#run grpreg
library(grpreg)
fit<-grpreg(X, y, group = group, penalty="grLasso", family="binomial")
#examine coefficients
coef(fit)[44:46,1:9]
#       0.2199         0.2133        0.2069        0.2008        0.1948      0.189        0.1834        0.1779     0.1726
# X15c       0  -2.139903e-17  8.559611e-17  1.069951e-16  2.139903e-16  0.0000000  5.991728e-16  3.423845e-16  0.0000000
# X15g       0   3.998654e-02  7.887908e-02  1.167650e-01  1.537286e-01  0.1898477  2.251939e-01  2.598331e-01  0.2938264
# X15t       0   9.878343e-02  1.948222e-01  2.882890e-01  3.793487e-01  0.4681499  5.548270e-01  6.395019e-01  0.7222856
This is probably a numerical problem. In the first row of the output (X15c), most coefficients are very close to 0 but not exactly 0. For the lambdas 0.189 and 0.1726, however, the X15c coefficient is exactly 0 while X15g and X15t in the same group are nonzero, which breaks the above-mentioned constraint.
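To make the check explicit, here is a minimal sketch in base R that flags, for each lambda, the groups whose coefficients mix exact zeros with nonzeros. The helper name `mixed_zero_groups` is my own and not part of grpreg; the example matrix simply re-enters the values printed above.

```r
# Hypothetical helper (not part of grpreg): returns a logical matrix with one
# row per group and one column per lambda; TRUE marks a group that contains
# both exactly-zero and nonzero coefficients at that lambda.
mixed_zero_groups <- function(beta, group) {
  beta <- as.matrix(beta)
  # number of exactly-zero coefficients per group and lambda
  n_zero <- rowsum((beta == 0) * 1, group)
  # number of coefficients in each group (same group ordering as n_zero)
  size <- as.vector(rowsum(rep(1, nrow(beta)), group))
  n_zero > 0 & n_zero < size
}

# Coefficients for group 15, copied from the output above
beta <- rbind(
  X15c = c(0, -2.139903e-17, 8.559611e-17, 1.069951e-16, 2.139903e-16,
           0, 5.991728e-16, 3.423845e-16, 0),
  X15g = c(0, 3.998654e-02, 7.887908e-02, 1.167650e-01, 1.537286e-01,
           0.1898477, 2.251939e-01, 2.598331e-01, 0.2938264),
  X15t = c(0, 9.878343e-02, 1.948222e-01, 2.882890e-01, 3.793487e-01,
           0.4681499, 5.548270e-01, 6.395019e-01, 0.7222856)
)
colnames(beta) <- c("0.2199", "0.2133", "0.2069", "0.2008", "0.1948",
                    "0.189", "0.1834", "0.1779", "0.1726")

res <- mixed_zero_groups(beta, group = rep(15, 3))
flagged <- colnames(res)[res[1, ]]
print(flagged)  # lambdas at which group 15 is only partially zero
```

On the full fit, something like `mixed_zero_groups(coef(fit)[-1, ], group)` should work, assuming the first row of `coef(fit)` is the intercept and the remaining rows follow the column order of `X`.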
Thank you in advance for looking into it,
Szymon
PS. The data I used is a subset of the Promoter dataset, isolated for the purpose of reproducing this behavior.