Closed tlnagy closed 8 years ago
My guess is that it's related to different package versions. I will be away for the next few days, but should be able to diagnose the issue early next week. Thanks for your interest.
I think the problem lies in calling coef on rsubs (a leaps::regsubsets object) in the following line of code/create-models.R:
predictors <- names(coef(rsubs, optimal.index))[-1]
For some reason, coef is returning NULL. Perhaps the coef function changed in newer versions of R.
More diagnosis to come.
coef bug example
Confirming that calling coef() on a leaps::regsubsets object, as specified in the package doc, returns NULL:
library(leaps)
# 20 observations with 5 predictors
set.seed(0)
X <- matrix(rnorm(100), 20)
y <- rnorm(20)
# perform best subset regression
rsubs <- leaps::regsubsets(X, y)
# try to retrieve the coefficients for model with size 2
coef(rsubs, 2)
This occurs in R versions 3.1.3 and 3.2.2. The leaps package was last updated on 2009-05-05 (version 2.9), so perhaps this bug was introduced with R version 3. We will try to get in touch with the leaps developer, @tslumley.
That's not what I see in R 3.2.1. And I don't see how coef.regsubsets can return NULL:
> library(leaps)
>
> # 20 observations with 5 predictors
> set.seed(0)
> X <- matrix(rnorm(100), 20)
> y <- rnorm(20)
>
> # perform best subset regression
> rsubs <- leaps::regsubsets(X, y)
>
> # try to retrieve the coefficients for model with size 2
> coef(rsubs, 2)
(Intercept) a d
-0.03520462 0.17390467 0.32332662
> methods("coef")
[1] coef.aov* coef.Arima* coef.default* coef.listof*
[5] coef.nls* coef.regsubsets*
see '?methods' for accessing help and source code
Also, does the example on the help page work for you? That's run as part of the CRAN package checks.
coef.regsubsets diagnosis
@tslumley, I was using leaps version 2.8 rather than the current 2.9. When I switched to 2.9, library(leaps) properly loads coef.regsubsets.
How did I end up with an outdated version of leaps? I wanted to specify which leaps version to install, so I used the following install command:
wget https://cran.r-project.org/src/contrib/Archive/leaps/leaps_2.8.tar.gz
R CMD INSTALL leaps_2.8.tar.gz
Version 2.8 was the latest available in the archives, and I mistakenly assumed it was the most recent. Why CRAN doesn't include the current version in the archives is beyond me. Basically, for a stable command to install a specific package version you need to resort to a hack.
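One way to catch this kind of version mismatch early is to check the installed version at the top of the analysis script. The following is only a sketch: `require_min_version` is a hypothetical helper name, not something used in this project, and the minimum version passed to it is whatever the analysis is known to work with.

```r
# Hypothetical helper (not part of this project): stop early when an
# installed package is older than the version the analysis expects.
require_min_version <- function(pkg, min_version) {
  installed <- utils::packageVersion(pkg)
  if (installed < package_version(min_version)) {
    stop(sprintf("%s %s is installed, but >= %s is required",
                 pkg, as.character(installed), min_version))
  }
  invisible(installed)
}

# Example for the case above, where coef() returned NULL under leaps 2.8
# (uncomment if leaps is installed):
# require_min_version("leaps", "2.9")
```

A failed check stops the script with a message naming both versions, which is easier to diagnose than a NULL return further down the pipeline.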
I can confirm @tslumley's results. The leaps package works fine on my machine. That doesn't seem to be the problem here.
The leaps package works fine on my machine. That doesn't seem to be the problem here.
Agreed, the issue is not caused by leaps. However, I switched to explicitly loading and attaching leaps to be safe.
Rscript
Now, I am getting the same error as @tlnagy originally reported. The error appears to occur in line 54 of create-models.R:
cv.lasso <- glmnet::cv.glmnet(X.mat, y, w, alpha=glmnet.alpha, standardize=FALSE)
I get the error when I execute the analysis from the shell using Rscript ./code/run.R. However, if I run the analysis by launching an R session from the project's root directory and then run source('./code/run.R'), the code progresses past create-models.R before another error occurs.
Will continue diagnosis, with the eventual goal of specifying the version information that successfully executes the analysis.
I was using leaps version 2.8 rather than the current 2.9.
There are a bunch of recent tools for managing sets of packages in R. The webpage for the Aalborg useR 2015 meeting would be one place to start looking.
Hi, while I'm not using your code, I'm getting a similar error message from cv.glmnet when I run it via Rscript in the terminal. My script calls the following line of code:
cv.dat = cv.glmnet(x,y,grouped=FALSE,nfolds=length(y),alpha = num,parallel=TRUE,type.measure=typemeasure,family=familydist)
And here's where I get the same error message:
Error in is(x, "CsparseMatrix") : could not find function "new"
Calls: source ... <Anonymous> -> glmnet -> elnet -> getcoef -> drop0 -> is
Execution halted
There's no problem running the code via the R GUI. Do you think it's a cv.glmnet error that occurs only under Rscript? I can't find a solution.
I boiled the problem down to a reproducible example, which fails via Rscript but succeeds in an interactive session:
library(glmnet)
# dataset dimensions
observations <- 100
predictors <- 300
# generate dataset
set.seed(0)
x <- matrix(rnorm(observations * predictors), nrow=observations)
y <- rnorm(observations)
# fit glmnet model
glmnet::cv.glmnet(x, y)
@alifar76, the issue is that Rscript doesn't load the methods package by default (hat tip @a-pankov). I found that adding library('methods') to the beginning of the script enabled execution via Rscript. Will email Trevor Hastie, glmnet maintainer, about this bug.
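As a minimal sketch of that workaround (not the project's actual script), the top of a script meant to run under Rscript could attach methods before anything else and confirm that it is on the search path. The `is()` call is only a stand-in for the S4 calls that glmnet and Matrix make internally.

```r
# Attach methods explicitly, since Rscript may not do so by default.
library(methods)

# Confirm that methods is on the search path before loading glmnet.
stopifnot("package:methods" %in% search())

# is() and new() are provided by methods; this call is a stand-in for the
# S4 machinery that failed in the traceback above.
print(is(1, "numeric"))
```

With this at the top of the script, the rest of the analysis can load glmnet as before.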
@dhimmel, take a look at the checkpoint package. @jfreimer showed me the package and it seems like a good solution for these types of issues.
This is why I wish R had something like Python's virtualenvs and conda (https://github.com/conda/conda), which make package version clashes and other related problems a thing of the past.
I wish R had something like virtualenvs and conda
@tlnagy, conda recently began supporting R. I have been using conda for R and enjoying it, but with a few caveats:
take a look at the checkpoint package.
@a-pankov, will do. I also came across packrat.
Awesome! Thanks a lot for having this solved @dhimmel! :+1:
It'll be good to reach out to Trevor Hastie and inform him about this issue. Cheers!
Hi,
Thanks for posting this. I ran into this error and library("methods") solved it for me. Out of curiosity, did Trevor Hastie respond or fix the problem in glmnet?
@tinyheero, Hastie did not respond. Consider also emailing him in case he missed my message. Also perhaps gently nudge him to develop the code on github as that would make issue reporting and resolution much easier.
I added a sessionInfo() call to code/run.R, so version information is now retained. See this commit and corresponding version info for a setup where the analysis runs without error.
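For readers who want to keep a similar record, here is a rough sketch of saving the sessionInfo() output to a text file. The file name and the use of a temporary directory are assumptions for illustration; the thread does not say where or how run.R stores this information.

```r
# Hypothetical output location; adjust to wherever results are kept.
info_path <- file.path(tempdir(), "session-info.txt")

# capture.output() turns the printed sessionInfo() into character lines.
info_lines <- utils::capture.output(utils::sessionInfo())
writeLines(info_lines, info_path)
```

Committing such a file alongside the results ties each run to the R and package versions that produced it.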
The original issue @tlnagy encountered was caused by a glmnet update. We identified a workaround, which is to add library(methods) when using glmnet via Rscript.
glmnet 2.0-3 triggers a new error when running via Rscript and not loading methods
The glmnet 2.0-3 update did something with respect to loading the methods package.
The reproducible example succeeds with v2.0-3 in an interactive R session. However, it fails when executed via Rscript with the following output:
Loading required package: Matrix
Loading required package: foreach
Loaded glmnet 2.0-3
Error in rbind2sparse(x, y) : could not find function "checkAtAssignment"
Calls: <Anonymous> ... rbind2 -> rbind2 -> rbind2 -> rbind2 -> rbind2sparse
Execution halted
As in v2.0-2, explicitly loading methods (i.e. library(methods)) solves the problem.
Takeaway: Always use the double colon operator when referencing functions in packages. For example, methods::checkAtAssignment or methods::rbind2.
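To illustrate the takeaway, the sketch below qualifies every call into methods with the double colon operator, so the function does not rely on methods being attached to the search path. `describe_class` is a hypothetical helper written for this example and is not taken from glmnet or Matrix.

```r
# Hypothetical helper: the methods call is namespace-qualified, so it
# does not depend on library(methods) having been called.
describe_class <- function(x) {
  if (methods::is(x, "numeric")) "numeric" else class(x)[1]
}

print(describe_class(3.14))
print(describe_class("a"))
```

Note that the double colon operator only reaches functions a package exports, and inside a package the usual alternative is to import the needed functions in the NAMESPACE file.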
Hi David, I tried running your code as follows
and I'm getting the following error:
Any ideas on why this could be?
sessioninfo: