pbreheny / biglasso

biglasso: Extending Lasso Model Fitting to Big Data in R
http://pbreheny.github.io/biglasso/

Warnings/errors on run #25

Closed GabeAl closed 5 years ago

GabeAl commented 5 years ago

I have run biglasso on a dataset similar to the one reported previously (mostly sparse, with the remaining values being whole numbers cast as numeric). I made sure nothing was NA, infinite, or NaN and that the type was numeric (I remember the integer bug from before).

> glm.trained = cv.biglasso(as.big.matrix(X),factor(y),family="binomial",ncores=16)
Warning message:
In mean.default(y) : argument is not numeric or logical: returning NA
> plot(glm.trained)
Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
> glm.trained$cve
numeric(0)
> glm.trained$cvse
numeric(0)
> glm.trained$lambda
numeric(0)

Is this expected? I have never been able to get biglasso to work on my real problems the way glmnet does -- maybe my use cases are too unconventional?

> sum(is.na(X))
[1] 0
> sum(is.infinite(X))
[1] 0
> sum(is.nan(X))
[1] 0
> str(X)
 num [1:72, 1:1000926] 0 35 0 0 59 0 456 0 0 0 ...
> sum(rowMeans(X)==0)
[1] 0

glmnet runs fine. Other fields of the output, glm.trained, are populated: $y (with 0's and 1's that line up reasonably well with the actual classes), center, and scale. The coefficient matrix also looks like it has numbers in it (quite sparse, as expected).

Thanks for any insight. I can try to finagle a minimal reproducible example if the problem isn't obvious with my command and the error messages!

privefl commented 5 years ago

y <- c(0, 1)
is.numeric(factor(y))

GabeAl commented 5 years ago

Done. That fixed it.
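
For anyone landing here with the same warning, the hint above can be spelled out as a small base-R sketch. The labels in `y_raw` are made up for illustration, and the final call is left commented out because it only repeats the command from the report with a numeric response swapped in; presumably that is the change that fixed it.

```r
# Hypothetical class labels, only for illustration.
y_raw <- c("control", "case", "case", "control")
y_fac <- factor(y_raw, levels = c("control", "case"))

# A factor is not numeric, so mean() warns and returns NA,
# which matches the warning in the report.
is.numeric(y_fac)                      # FALSE
m <- suppressWarnings(mean(y_fac))     # NA

# Convert to a 0/1 numeric vector: as.numeric() on a factor returns the
# level codes (1, 2), so subtract 1 to get 0 for the first level.
y_num <- as.numeric(y_fac) - 1

# Pitfall: if the factor was built from numbers, the codes are not the
# original values. Go through as.character() to recover them.
y_codes <- as.numeric(factor(c(0, 1)))                # 1 2
y_back  <- as.numeric(as.character(factor(c(0, 1))))  # 0 1

# The call from the report with a numeric response (not run here):
# glm.trained <- cv.biglasso(as.big.matrix(X), y_num, family = "binomial", ncores = 16)
```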

Going to open a new feature request to auto-cast variables to whatever format biglasso internally expects -- inevitably I'm going to run into more of these, just like before with the int/numeric thing and the big.matrix thing (although at least the latter is in the vignettes).

Thanks again!
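
Until something like the auto-cast request exists, a user-side pre-flight check is one possible workaround. The sketch below is only an assumption about what such a check could look like: `coerce_for_biglasso` is a hypothetical helper, not part of biglasso, and it uses base R only. It covers the three pitfalls mentioned in this thread (integer storage, a factor response, and non-finite values) and leaves the `as.big.matrix()` conversion to the caller.

```r
# Hypothetical helper, NOT part of biglasso: coerce inputs to plain
# double-precision numerics before handing them to the fitting call.
coerce_for_biglasso <- function(X, y) {
  stopifnot(is.matrix(X), length(y) == nrow(X))

  # Integer matrices: switch the storage mode to double in place.
  if (is.integer(X)) storage.mode(X) <- "double"

  # Two-level factors become 0/1 (first level -> 0); logicals become 0/1.
  if (is.factor(y)) {
    stopifnot(nlevels(y) == 2)
    y <- as.numeric(y) - 1
  } else if (is.logical(y)) {
    y <- as.numeric(y)
  }

  # is.finite() is FALSE for NA, NaN, Inf and -Inf, so one check covers all.
  stopifnot(is.numeric(X), is.numeric(y),
            all(is.finite(X)), all(is.finite(y)))

  list(X = X, y = as.numeric(y))
}

# Small usage example with made-up data.
X_demo <- matrix(1:6, nrow = 3)          # integer matrix
y_demo <- factor(c("a", "b", "a"))       # factor response
out <- coerce_for_biglasso(X_demo, y_demo)
str(out)
```

The returned `out$X` and `out$y` could then be passed on as in the original command, with `as.big.matrix(out$X)` in place of `as.big.matrix(X)`.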