Closed: skywang0407 closed this issue 4 years ago
Do we need to change the penalty grid?
What is the RMSE when the penalty is 0?
Hey Dr. McGowan (@LucyMcGowan), I might have the same issue. I get this yellow warning in my console:

```
! Fold08: internal: A correlation computation is required, but estimate
is constant and...
```

And for all penalties from 0 to 100, the RMSE remains the same. The following is my code. This issue only happens with my lasso; my ridge works fine with almost the same code.
```r
lasso_spec <- linear_reg(penalty = tune(), mixture = 1) %>%
  set_engine("glmnet")
music_train_cv <- vfold_cv(music_train, v = 10)
grid <- expand_grid(penalty = seq(0, 100, by = 1))
results_lasso <- tune_grid(lasso_spec,
                           preprocessor = rec,
                           grid = grid,
                           resamples = music_train_cv)
results_lasso %>%
  collect_metrics() %>%
  filter(.metric == "rmse") %>%
  arrange(mean)
```
Any suggestion is appreciated. Thanks!
@LutionHan, I have the same errors with the lasso and elastic net regressions but not the ridge.
I think it may have something to do with the preprocessing, since the warning says that "the estimate is constant and has a 0 standard deviation, resulting in a divide by 0 error." When defining a recipe for preprocessing, we use step_scale(all_predictors()), which divides each predictor by its standard deviation, so I'm assuming it's an issue with that, though I'm not sure why the errors do not appear for the ridge regression.
I have noticed that changing the mixture value when defining the regression model (using linear_reg()) affects whether the error appears when running tune_grid(), but I cannot figure out exactly how. A mixture of 0.01 does not yield any errors; a mixture of 0.05 elicits errors in folds 3, 5, and 7 but not the others; a mixture >= 0.06, however, elicits that error in all folds.
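If it helps to probe this systematically, here is a sketch that tunes mixture alongside penalty (the specific grid values are my own choice, not from the assignment):

```r
# Tune both the penalty and the mixture (0 = ridge, 1 = lasso)
# to see where the constant-estimate warning starts appearing.
enet_spec <- linear_reg(penalty = tune(), mixture = tune()) %>%
  set_engine("glmnet")
grid_enet <- expand_grid(penalty = seq(0, 10, by = 1),
                         mixture = c(0.01, 0.05, 0.06, 0.5, 1))
```

Passing grid_enet to tune_grid() in place of the penalty-only grid should show which penalty/mixture combinations trigger the warning.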
As for the original question @skywang0407, I also get the same RMSE values if the penalty is between 10 and 100. However, if the penalty is between 0 and 10 (@LucyMcGowan), the mean RMSE changes to some degree (see attached).
If I had to venture a guess, this means that any lasso regression with at least a penalty of 7 performs equally well. If you want to validate this, simply check whether the mean rmse values for different penalties are equal (see below):
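One sketch of such a check, assuming the results_lasso object from the earlier tune_grid() call:

```r
# Count the distinct mean RMSE values across the penalty grid;
# a single remaining row means every penalty gave the same result.
results_lasso %>%
  collect_metrics() %>%
  filter(.metric == "rmse") %>%
  distinct(mean)
```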
@skywang0407, I hope that answers your question, and, from another student's perspective, I think your code is good. @LutionHan, I get the same errors as you but am not sure exactly how to deal with them. I think the code works properly either way, but I would appreciate any thoughts you or @LucyMcGowan have.
@LucyMcGowan I am also having the issue regarding the following error message:
! Fold08: internal: A correlation computation is required, but estimate is constant and..
Similarly, this is not a problem when mixture = 0 for the ridge regression, but it pops up for the lasso regression. Is this "error/warning" something to worry about?
I am still trying to get to the bottom of this, but for the time being, ignore this error. I think it occurs because all of the estimates are the same for the penalty values provided (so there is no standard deviation).
@LucyMcGowan My means vary when I use
grid <- expand_grid(penalty = seq(0, 5, by = .5))
Are these appropriate numbers for tuning?
@ConnorReardon, we used seq(0, 100, by = 10) in class, but I'm not sure what the correct protocol for choosing those values is.
@LucyMcGowan However, if I ignore these warnings, then I am not able to select a best penalty/mixture, since all the estimates are the same. Are we allowed to just conclude that "all choices of penalty/mixture make no difference at all"?
@LutionHan Try something like this: grid <- expand_grid(penalty = seq(0, 10, by = 1))
The estimates change for different penalties between 0 and 10.
@jdtrat I just tried and get the same issue. All the estimates are the same:
@LutionHan what's your recipe? This is for lasso with the same code from earlier in this thread?
@jdtrat Here is my code:

```r
lasso_spec <- linear_reg(penalty = tune(), mixture = 1) %>%
  set_engine("glmnet")
music_train_cv <- vfold_cv(music_train, v = 10)
rec_lasso <- recipe(lat ~ ., data = music_train) %>%
  step_scale(lat)
grid_lasso <- expand_grid(penalty = seq(0, 10, by = 1))
results_lasso <- tune_grid(lasso_spec,
                           preprocessor = rec_lasso,
                           grid = grid_lasso,
                           resamples = music_train_cv)
results_lasso %>%
  collect_metrics() %>%
  filter(.metric == "rmse") %>%
  arrange(mean)
```
@LutionHan I think the issue is you're not scaling all predictors. Try:

```r
rec_lasso <- recipe(lat ~ ., data = music_train) %>%
  step_scale(all_predictors())
```
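Putting the pieces together, a sketch of the corrected pipeline (assuming the music_train data and the lasso_spec, grid_lasso, and music_train_cv objects defined earlier in this thread):

```r
# Scale the predictors, not the outcome, before tuning the lasso.
rec_lasso <- recipe(lat ~ ., data = music_train) %>%
  step_scale(all_predictors())

results_lasso <- tune_grid(lasso_spec,
                           preprocessor = rec_lasso,
                           grid = grid_lasso,
                           resamples = music_train_cv)

# Pick the penalty with the lowest cross-validated RMSE.
results_lasso %>%
  select_best(metric = "rmse")
```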
@jdtrat That works! Thank you so much!
P.S. I am just wondering: in the lecture slides we use "%>% step_scale(the response variable)" and it works well. Is there any difference between these two cases?
@LutionHan Glad it works!
I am not sure about that slide. My understanding is that step_scale applies to all the predictors in the model so they can be equally weighted. Maybe Dr. McGowan could elucidate.
@jdtrat Thanks anyway!
Thank you everyone for weighing in and helping! There is a bit of an art to picking the penalty. If they are all giving the same result (which I think is what that error is referring to), then you can try varying the penalties chosen to see if you can get some variability. If that doesn't work, you can just note the error and choose one at random, since they all give the same result (and note that you have done this). @LutionHan, can you link to the slide where I say that you should scale the response variable? That is probably a typo; you want to scale all of the predictors, not the outcome.
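As one way to vary the penalties, here is a sketch using a log-spaced grid (the specific range is my own choice, but log spacing is common with glmnet since penalties act on a multiplicative scale):

```r
# Log-spaced candidate penalties from 0.001 to 10; this often
# reveals more variation in RMSE than an evenly spaced grid.
grid <- expand_grid(penalty = 10^seq(-3, 1, length.out = 20))
```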
@LucyMcGowan I am sorry that I misunderstood the slides. I checked the slides again and found that in the lecture example we only have one predictor (horsepower), so we only have to scale one variable; I mistakenly thought it was the response. Thanks for your and @jdtrat's help!
@LucyMcGowan so when the increase in penalty yields the same RMSE, do we think that this is an R internal error (seems unlikely) or maybe the optimized coefficient values hit a local minimum at those penalty values?
Hi Professor McGowan, when I am finding the lambda for the lasso model, all penalties show the same mean value. Can you please help me check what is wrong?
Thanks!