pbreheny / grpreg

Regularization paths for regression models with grouped covariates
http://pbreheny.github.io/grpreg/

Error in X[, ind, drop = FALSE] : subscript out of bounds error #17

Closed (jonhersh closed 6 years ago)

jonhersh commented 6 years ago

Hi there,

I'm really excited about the package. Thanks for all your hard work on it.

I'm getting a pesky subscript out of bounds error when I run the program. I don't think I'm making an error but I could be wrong. Can you advise whether this is a bug on my end or yours?

library(RCurl)   # getURL()
library(grpreg)  # cv.grpreg()

# Download the example data set and parse it as CSV
DF <- read.csv(text = getURL(
  "https://raw.githubusercontent.com/jonhersh/datasets/master/myDF2.csv"), header = TRUE)

Xtrain <- DF[,1:74]
Ytrain <- DF[,75]

groups <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,5,5,5,5,5,5,5,5,5,6,6,6,7,7,7,7,7,8,8,8,8,8,8,8,8,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9)
groups.factor <- factor(groups, labels = c("head","hh demographics","dwelling","public services","durables","education","region/sector","head LF","hh LF"))

# group lasso using grpreg
grpreg.fit <- cv.grpreg(Xtrain,Ytrain,group = groups.factor)

pbreheny commented 6 years ago

The error is coming up because you have a lot of columns that are constants (i.e., all 0 or all 1). If you drop them, grpreg works fine:

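X <- Xtrain  # X and Y are the Xtrain and Ytrain objects from the original post
Y <- Ytrain
constant <- apply(X, 2, sd) < 1e-10  # TRUE for columns with (numerically) zero variance
X <- X[,!constant]
g <- groups.factor[!constant]        # keep the group labels aligned with the remaining columns
grpreg.fit <- cv.grpreg(X, Y, group=g)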

Now, grpreg is supposed to handle this automatically, so the user doesn't have to manually remove the constant columns first, but this apparently isn't covering all possible cases. I've taken a look and unfortunately it's not a simple fix, so I'll need to keep working on it. In the meantime, you can work around it by dropping the constant columns yourself as in the above code.
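
For readers who want to see the workaround in isolation, here is a minimal sketch on a small made-up matrix rather than the data set from this issue. The names `X_kept` and `group_kept` are hypothetical, and the `droplevels()` call is an extra precaution that is not part of the code above: it assumes you want to discard group labels that no longer have any columns.

```r
# Toy data (hypothetical, not the issue's data set): columns b and e are constant
set.seed(1)
X <- cbind(a = rnorm(6), b = 0, c = rnorm(6), d = rnorm(6), e = 1)
group <- factor(c("g1", "g1", "g2", "g2", "g3"))

# Flag columns whose standard deviation is numerically zero
constant <- apply(X, 2, sd) < 1e-10

# Drop the constant columns and the matching group labels together,
# so that length(group_kept) still equals ncol(X_kept)
X_kept <- X[, !constant, drop = FALSE]

# Assumption: group g3 has lost its only column, so its unused level is dropped
group_kept <- droplevels(group[!constant])
```

The filtered objects could then be passed on in the same way as above, for example `cv.grpreg(X_kept, y, group = group_kept)` with a response vector `y` of matching length.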

Thank you very much for pointing this out and filing an issue -- some design matrix issues are hard to anticipate.

pbreheny commented 6 years ago

This issue should be fixed as of grpreg_3.1-4. If you are still having issues, or find anything else, please let me know.