dhimmel / elevcan

Elevation and Cancer Incidence
https://doi.org/10.7717/peerj.705

glmnet version 2.0-2 fails to properly load the methods package #1

Closed tlnagy closed 8 years ago

tlnagy commented 9 years ago

Hi David, I tried running your code as follows

Rscript ./code/run.R

and I'm getting the following error:

Loading required package: Matrix
Loading required package: foreach
Loaded glmnet 2.0-2

Error in is(x, "CsparseMatrix") : could not find function "new"
Calls: source ... <Anonymous> -> glmnet -> elnet -> getcoef -> drop0 -> is
Execution halted

Any ideas on why this could be?

sessioninfo:

R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin14.5.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Hmisc_3.16-0    Formula_1.2-1   survival_2.38-3 lattice_0.20-33 metafor_1.9-7   ggplot2_1.0.1   glmnet_2.0-2    foreach_1.4.2  
[9] Matrix_1.2-2   

loaded via a namespace (and not attached):
 [1] Rcpp_0.11.6         cluster_2.0.3       magrittr_1.5        leaps_2.9           splines_3.2.2       MASS_7.3-43        
 [7] munsell_0.4.2       colorspace_1.2-6    stringr_1.0.0       plyr_1.8.3          tools_3.2.2         nnet_7.3-10        
[13] gtable_0.1.2        latticeExtra_0.6-26 iterators_1.0.7     digest_0.6.8        gridExtra_2.0.0     RColorBrewer_1.1-2 
[19] reshape2_1.4.1      acepack_1.3-3.3     codetools_0.2-14    rpart_4.1-10        labeling_0.3        stringi_0.5-5      
[25] scales_0.2.5        foreign_0.8-65      proto_0.3-10

dhimmel commented 9 years ago

My guess is that it's related to different package versions. I will be away for the next few days, but should be able to diagnose the issue early next week. Thanks for your interest.

dhimmel commented 9 years ago

I think the problem lies in calling coef on rsubs (a leaps::regsubsets object) in the following line of code/create-models.R.

predictors <- names(coef(rsubs, optimal.index))[-1]

For some reason, coef is returning NULL. Perhaps the coef function changed in newer versions of R.

More diagnosis to come.

dhimmel commented 9 years ago

Reproducible coef bug example

Confirming that calling coef() on a leaps::regsubsets object, as specified in the package documentation, returns NULL:

library(leaps)

# 20 observations with 5 predictors
set.seed(0)
X <- matrix(rnorm(100), 20)
y <- rnorm(20)

# perform best subset regression
rsubs <- leaps::regsubsets(X, y)

# try to retrieve the coefficients for model with size 2
coef(rsubs, 2)

This occurs in R versions 3.1.3 and 3.2.2. The leaps package was last updated on 2009-05-05 (version 2.9), so perhaps this bug was introduced with R version 3. We will try to get in touch with the leaps developer, @tslumley.

tslumley commented 9 years ago

That's not what I see in R 3.2.1. And I don't see how coef.regsubsets can return NULL

> library(leaps)
> 
> # 20 observations with 5 predictors
> set.seed(0)
> X <- matrix(rnorm(100), 20)
> y <- rnorm(20)
> 
> # perform best subset regression
> rsubs <- leaps::regsubsets(X, y)
> 
> # try to retrieve the coefficients for model with size 2
> coef(rsubs, 2)
(Intercept)           a           d 
-0.03520462  0.17390467  0.32332662 
> methods("coef")
[1] coef.aov*        coef.Arima*      coef.default*    coef.listof*    
[5] coef.nls*        coef.regsubsets*
see '?methods' for accessing help and source code

tslumley commented 9 years ago

Also, does the example on the help page work for you? That's run as part of the CRAN package checks.

dhimmel commented 9 years ago

missing coef.regsubsets diagnosis

@tslumley, I was using leaps version 2.8 rather than the current 2.9. When I switched to 2.9, library(leaps) properly loads coef.regsubsets.
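A quick way to check for this kind of mismatch is to look at the installed version and at whether the S3 method is registered. The sketch below uses only base R; `has_s3_method` is a hypothetical helper name, and the leaps check runs only if leaps is installed.

```r
# Hypothetical helper: TRUE if an S3 method for `generic` can be found
# for `class`, FALSE otherwise.
has_s3_method <- function(generic, class) {
  !is.null(utils::getS3method(generic, class, optional = TRUE))
}

if (requireNamespace("leaps", quietly = TRUE)) {
  # Installed version; 2.9 is the version reported to work in this thread.
  print(utils::packageVersion("leaps"))
  # FALSE here would mean coef() falls through to another method,
  # which could explain a NULL result.
  print(has_s3_method("coef", "regsubsets"))
}
```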

How did I end up with an outdated version of leaps? I wanted to specify which leaps version to install, so I used the following install command:

wget https://cran.r-project.org/src/contrib/Archive/leaps/leaps_2.8.tar.gz
R CMD INSTALL leaps_2.8.tar.gz

Version 2.8 was the latest available in the archives, and I mistakenly assumed it was the most recent. Why CRAN doesn't include the current version in the archives is beyond me. Basically, for a stable command to install a specific package version you need to resort to a hack.
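One option that avoids hand-building archive URLs is `remotes::install_version()`, which is designed to find a requested version whether it is current or archived. This is a minimal sketch, assuming the remotes package is installed and network access is available; `needs_install` and `install_pinned` are hypothetical helper names, and the install call is shown but not run.

```r
# Hypothetical helper: TRUE if `pkg` is missing or is not exactly `version`.
needs_install <- function(pkg, version) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(TRUE)
  }
  utils::packageVersion(pkg) != package_version(version)
}

# Hypothetical wrapper: install an exact version only when needed.
# Assumes the remotes package is installed.
install_pinned <- function(pkg, version,
                           repos = "https://cran.r-project.org") {
  if (needs_install(pkg, version)) {
    remotes::install_version(pkg, version = version, repos = repos)
  }
  invisible(NULL)
}

# Example (not run here, needs network access):
# install_pinned("leaps", "2.9")
```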

tlnagy commented 9 years ago

I can confirm @tslumley's results. The leaps package works fine on my machine. That doesn't seem to be the problem here.

dhimmel commented 9 years ago

The leaps package works fine on my machine. That doesn't seem to be the problem here.

Agreed, the issue is not caused by leaps. However, I switched to explicitly loading and attaching leaps to be safe.

Error depends on Rscript

Now, I am getting the same error as @tlnagy originally reported. The error appears to occur in line 54 of create-models.R:

cv.lasso <- glmnet::cv.glmnet(X.mat, y, w, alpha=glmnet.alpha, standardize=FALSE)

I get the error when I execute the analysis from the shell using Rscript ./code/run.R. However, if I run the analysis by launching an R session from the project's root directory and then run source('./code/run.R'), the code progresses past create-models.R before another error occurs.

Will continue diagnosis, with the eventual goal of specifying the version information that successfully executes the analysis.

tslumley commented 9 years ago

I was using leaps version 2.8 rather than the current 2.9.

There are a bunch of recent tools for managing sets of packages in R. The webpage for the Aalborg useR 2015 meeting would be one place to start looking.

alifar76 commented 9 years ago

Hi, although I'm not using your code, I'm getting a similar error message from cv.glmnet when I run it via Rscript in the terminal. My script calls the following line of code:

cv.dat = cv.glmnet(x,y,grouped=FALSE,nfolds=length(y),alpha = num,parallel=TRUE,type.measure=typemeasure,family=familydist)

And here's where I get the same error message:

Error in is(x, "CsparseMatrix") : could not find function "new"
Calls: source ... <Anonymous> -> glmnet -> elnet -> getcoef -> drop0 -> is
Execution halted

There's no problem running the code via the R GUI. Do you think this is a cv.glmnet error that occurs only under Rscript? I can't find a solution.

dhimmel commented 9 years ago

Solved: glmnet fails when run via Rscript

I boiled the problem down to a reproducible example, which fails via Rscript but succeeds in an interactive session:

library(glmnet)

# dataset dimensions
observations <- 100
predictors <- 300

# generate dataset
set.seed(0)
x <- matrix(rnorm(observations * predictors), nrow=observations)
y <- rnorm(observations)

# fit glmnet model
glmnet::cv.glmnet(x, y)

@alifar76, the issue is that Rscript doesn't load the methods package (hat tip @a-pankov). Adding library('methods') to the beginning of the script allowed it to run via Rscript. I will email Trevor Hastie, the glmnet maintainer, about this bug.
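The workaround can be sketched as a script header. This assumes the script is run via Rscript in an R version where methods is not attached by default, as in this thread; `is_attached` is a hypothetical helper name.

```r
# Attach methods explicitly so S4 helpers such as new() and is() are found
# whether the script runs via Rscript or in an interactive session.
library(methods)

# Hypothetical helper: TRUE if `pkg` is on the search path (attached).
is_attached <- function(pkg) {
  paste0("package:", pkg) %in% search()
}

# Fail early with a clear message instead of deep inside a model fit.
stopifnot(is_attached("methods"))
```

Rscript also accepts a `--default-packages=` option listing the packages to attach at startup, which may be an alternative when the script itself cannot be edited.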

a-pankov commented 9 years ago

@dhimmel, take a look at the checkpoint package. @jfreimer showed me the package and it seems like a good solution for this type of issue.

tlnagy commented 9 years ago

This is why I wish R had something like Python's virtualenvs and conda (https://github.com/conda/conda), which make package version clashes and other related problems a thing of the past.

dhimmel commented 9 years ago

I wish R had something like virtualenvs and conda

@tlnagy, conda recently began supporting R. I have been using conda for R and enjoying it, but with a few caveats:

take a look at the checkpoint package.

@a-pankov, will do. I also came across packrat.

alifar76 commented 9 years ago

Awesome! Thanks a lot for having this solved @dhimmel! :+1:

It'll be good to reach out to Trevor Hastie and inform him about this issue. Cheers!

tinyheero commented 8 years ago

Hi,

Thanks for posting this. I ran into this error and library("methods") solved it for me. Out of curiosity, did Trevor Hastie respond or fix the problem in glmnet?

dhimmel commented 8 years ago

@tinyheero, Hastie did not respond. Consider emailing him as well, in case he missed my message. Perhaps also gently nudge him to develop the code on GitHub, as that would make issue reporting and resolution much easier.

dhimmel commented 8 years ago

I added a sessionInfo() call to code/run.R, so version information is now retained. See this commit and corresponding version info for a setup where the analysis runs without error.
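Recording the environment could look something like the sketch below. The helper name `save_session_info` and the output path are assumptions for illustration, not what code/run.R actually does.

```r
# Hypothetical helper: write the printed sessionInfo() output to a text file.
save_session_info <- function(path) {
  info <- utils::capture.output(print(utils::sessionInfo()))
  writeLines(info, path)
  invisible(path)
}

# Illustrative location; a real project would likely write next to its outputs.
out <- file.path(tempdir(), "session-info.txt")
save_session_info(out)
```

Keeping this file alongside the analysis outputs makes it possible to see later which R and package versions produced a given run.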

The original issue @tlnagy encountered was caused by a glmnet update. We identified a workaround, which is to add library(methods) when using glmnet via Rscript.

dhimmel commented 8 years ago

glmnet 2.0-3 triggers a new error when running via Rscript and not loading methods

The glmnet 2.0-3 update did something with respect to loading the methods package.

The reproducible example succeeds with v2.0-3 in an interactive R session. However, it fails when executed via Rscript with the following output:

Loading required package: Matrix
Loading required package: foreach
Loaded glmnet 2.0-3

Error in rbind2sparse(x, y) : could not find function "checkAtAssignment"
Calls: <Anonymous> ... rbind2 -> rbind2 -> rbind2 -> rbind2 -> rbind2sparse
Execution halted

As in v2.0-2, explicitly loading methods (i.e. library(methods)) solves the problem.

Takeaway: Always use the double colon operator when referencing functions in packages. For example, methods::checkAtAssignment or methods::rbind2.
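As a small illustration of the takeaway, the sketch below qualifies every call into methods, so the functions do not depend on methods being attached. `is_integer_vector` and `empty_numeric` are hypothetical names, not part of glmnet or Matrix.

```r
# Hypothetical helpers: each call into methods is namespace-qualified,
# so lookup does not rely on methods being on the search path.
is_integer_vector <- function(x) {
  methods::is(x, "integer")
}

empty_numeric <- function() {
  methods::new("numeric")
}

print(is_integer_vector(1:3))
print(length(empty_numeric()))
```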