Closed oxinabox closed 3 years ago
Merging #58 (55ea3db) into master (338e83f) will increase coverage by
0.00%
. The diff coverage is100.00%
.
@@ Coverage Diff @@
## master #58 +/- ##
=======================================
Coverage 92.45% 92.45%
=======================================
Files 8 8
Lines 1113 1114 +1
=======================================
+ Hits 1029 1030 +1
Misses 84 84
Impacted Files | Coverage Δ | |
---|---|---|
src/Lasso.jl | 90.42% <100.00%> (+0.03%) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 338e83f...55ea3db. Read the comment docs.
One thought about this: The Glmnet R package (which this package reimplements based on the JSS paper) actually provides an automated choice of λ values even when doing Ridge regression (and I guess people use that choice frequently when calling Glmnet).
The JSS paper however does not document that choice (they just describe what's implemented here). By comparing outputs (i.e., without looking at code) I had determined that for Ridge what they are doing is basically to replace the line in λmax /= α
in compute_λ
by λmax *= 1000`.
That seems like a good follow up PR. I would like to get this simpler thing in as soon as I can as this is blocking me from updating to Lasso 0.6, and thus StatsBase 0.33 and thus a bunch of other things.
Of course! I just remembered about this (and thought it may be worth sharing) after I saw your PR.
Thanks. Looks good to me. @nignatiadis the follow up PR is most welcome. If you get to it, please make sure to add alpha=0 to this test.
Comments on pull requests will not trigger Registrator, as it is disabled. Please try commenting on a commit or issue.
Comments on pull requests will not trigger Registrator, as it is disabled. Please try commenting on a commit or issue.
Thanks @AsafManela. In fact, just yesterday, the he Glmnet authors uploaded a manuscript on arXiv about all the new features in Glmnet since their 2010 paper. There they also mention (page 8, footnote 3) what I wrote above about the choice of grid for ridge regression.
Their suggestion (and the Glmnet suggestion) is to use the λmax of α = 0.001 for all α < 0.001. If we follow their recommendation, then this would unfortunately be a breaking change for the range of 0<α < 0.001. (The alternative, of only changing α=0 is also not great, because of the discontinuity in the λ grid as α ->0.)
Thanks for the pointer. I don't think it would break too much code given that for alpha=0 one had to specify a lambda, but I guess you are saying 0<alpha<0.001 cases could be out there. We can tag a new minor version, so I would vote for following the glmnet approach.
Makes the check in #50 more circumspect.
This change fixed the code i have that uses Lasso to perform ridge regression that i mentioned in https://github.com/JuliaStats/Lasso.jl/pull/50#issuecomment-785036961