pbreheny / ncvreg

Regularization paths for SCAD- and MCP-penalized regression models
http://pbreheny.github.io/ncvreg

Basic R parallelization of cv.ncvreg #2

Closed. grantbrown closed this 9 years ago

grantbrown commented 9 years ago

Adds a function, "pa.cv.ncvreg", and associated documentation. It allows the cross-validation calls to ncvreg to be executed in parallel. This adds a dependency on the "parallel" package, which is included in R >= 2.14.

Example usage and timings are available here

In addition, I've included the data files usually distributed with the package, in order to pass R CMD check (let me know if that's a problem, and I can revert it).

pbreheny commented 9 years ago

Hi Grant,

Thank you very much for this great addition to ncvreg. I've incorporated your changes in ncvreg version 3.3, which I've just submitted to CRAN. I used your code to rewrite cv.ncvreg, although I didn't merge your exact changes, because I wanted to set things up slightly differently. The main difference is that I didn't want a separate pa.cv.ncvreg function; I just added the parallel support directly into cv.ncvreg -- you can run it in parallel by passing the cluster as an argument:

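library(parallel)  # provides makeCluster() and stopCluster()
cl <- makeCluster(4)
cvfit <- cv.ncvreg(X, y, cluster=cl, nfolds=length(y))
stopCluster(cl)  # release the worker processes when done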
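The call above can be tried end to end with something like the following. This is only a sketch: the simulated X and y, the dimensions, the two-worker cluster, and the 10-fold setting are illustrative assumptions, and it presumes an installed ncvreg version that accepts the cluster argument (3.3 or later, per this comment).

```r
library(parallel)
library(ncvreg)  # assumed installed, version >= 3.3

# Simulated data (illustrative only): 60 observations, 20 predictors,
# of which the first three have nonzero coefficients.
set.seed(1)
n <- 60
p <- 20
X <- matrix(rnorm(n * p), n, p)
beta <- c(2, -2, 1, rep(0, p - 3))
y <- as.numeric(X %*% beta + rnorm(n))

# Start a small local cluster and hand it to cv.ncvreg via `cluster`.
# tryCatch(..., finally = ) makes sure the workers are shut down even
# if the fit stops with an error.
cl <- makeCluster(2)
cvfit <- tryCatch(
  cv.ncvreg(X, y, cluster = cl, nfolds = 10),
  finally = stopCluster(cl)
)

# Value of lambda with the smallest cross-validation error
cvfit$lambda.min
```

Setting nfolds=length(y), as in the snippet above, gives leave-one-out cross-validation, which is where spreading the folds across workers is likely to help most.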

Thanks again for this addition, and sorry for the delay in incorporating it -- I was in the middle of a big update of ncvreg to include survival models, and wanted to get that sorted out before working on this.

grantbrown commented 9 years ago

No problem, glad you found it useful. In retrospect, I definitely agree that your modifications make more sense than a separate parallel function: there is less code to maintain, and it keeps the package namespace cleaner.