tlverse / hal9001

🤠 📿 The Highly Adaptive Lasso
https://tlverse.org/hal9001
GNU General Public License v3.0
49 stars 15 forks source link

Add `lambda.min.ratio` argument again #86

Closed Larsvanderlaan closed 3 years ago

Larsvanderlaan commented 3 years ago

When making the changes to the standardize argument, I accidentally removed some features including the lambda.min.ratio argument and the cv_select = FALSE prediction matrix fix. As the cv_select = FALSE bug is already fixed, I won't make any changes regarding this.

nhejazi commented 3 years ago

Thanks for the work on these changes. A few comments below. Let's try to break up this PR into smaller, incremental changes that we can better document, test, and review. For now, the CI builds are failing due to incomplete documentation or errors in required documentation fields. Specific comments:

jeremyrcoyle commented 3 years ago

lambda.min.ratio works for both gaussian and binomial families. The documentation just says that if small lambdas yield a saturated model, it wont continue to fit even smaller lambdas, which is fine, and not the same as silently ignoring the argument. I suggest we remove the check about lambda.min.ratio

Here's some code to confirm:

x = matrix(rnorm(100 * 20), 100, 20)
y = rnorm(100)
fit1 = glmnet(x, y)
fit2 = glmnet(x,y,lambda.min.ratio=1e-6)
min(fit1$lambda)
min(fit2$lambda)
x = matrix(rnorm(100 * 20), 100, 20)

y2 = rbinom(100,1,0.5)
fit1 = glmnet(x, y2, family=binomial())
fit2 = glmnet(x,y2,lambda.min.ratio=1e-6)
min(fit1$lambda)
min(fit2$lambda)

We've seen that not setting lambda.min.ratio can lead to no lambda values that fit the data sufficiently well, and so I believe its appropriate to override the default.

What's the advantage of using dot args when we're trying to override the defaults of glmnet (and be explicit to the end user about it)?

rachaelvp commented 3 years ago

The warning that is suppressed is returned by glmnet and goes something like this: "Convergence for n th lambda value not reached after maxit=i iterations; solutions for larger lambdas returned".

This warning is ok, and is related to what Jeremy mentioned above about small lambdas yielding a saturated model and glmnet terminating the fitting for even smaller lambdas.

I only suppressed it because I thought the tests would fail if a warning was produced.

nhejazi commented 3 years ago

Got it, thanks for clarifying where the warnings are coming from. I’d always taken that warning to indicate a serious issue with fitting across the glmnet regularization path (since you don’t get predictions for each lambda), but fine to ignore, I guess. Sounds like it’s only showing up just now because we explicitly set lambda.min.ratio to a small value?