tune() `alpha` and `lambda` hyperparameters in XGBoost

afewmoments commented 3 years ago

I want to be able to tune() all available hyperparameters in xgboost (in particular lambda and alpha). It seems that I can only tune() what is listed on the help page https://parsnip.tidymodels.org/reference/details_boost_tree_xgboost.html:

tree_depth
trees
learn_rate
mtry
min_n
loss_reduction
sample_size
stop_iter

Currently, to set lambda I have to pass an explicit value to set_engine():

boost_tree("regression",
    trees = tune(),
    learn_rate = tune()) %>%
set_engine("xgboost", lambda = 100)

How can I tune() lambda in that case? More generally, why can't I tune() every hyperparameter listed in the xgboost documentation (https://xgboost.readthedocs.io/en/latest/parameter.html)?

juliasilge commented 3 years ago

You can read a bit more about tuning engine-specific parameters in this blog post.

If you feel that it is likely that many people will want to tune lambda, we could move this over to dials and keep this as an open issue there, considering it for prioritization to add to the engine-specific parameters for xgboost.

In the meantime, what you need to do is create a parameter function and then a parameter set:

library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#>   method                   from   
#>   required_pkgs.model_spec parsnip

xgb_spec <-
   boost_tree("regression",
              trees = tune(),
              learn_rate = 0.02) %>%
   set_engine("xgboost", lambda = tune())

lambda <- function(range = c(-10, 0), trans = log10_trans()) {
   new_quant_param(
      type = "double",
      range = range,
      inclusive = c(TRUE, TRUE),
      trans = trans,
      label = c(lambda = "Amount of Regularization"),
      finalize = NULL
   )
}

param_set <- parameters(list(lambda(), trees()))
car_folds <- vfold_cv(mtcars, v = 3)
tune_grid(xgb_spec, mpg ~ ., resamples = car_folds, param_info = param_set)
#> # Tuning results
#> # 3-fold cross-validation 
#> # A tibble: 3 × 4
#>   splits          id    .metrics          .notes          
#>   <list>          <chr> <list>            <list>          
#> 1 <split [21/11]> Fold1 <tibble [20 × 6]> <tibble [0 × 1]>
#> 2 <split [21/11]> Fold2 <tibble [20 × 6]> <tibble [0 × 1]>
#> 3 <split [22/10]> Fold3 <tibble [20 × 6]> <tibble [0 × 1]>

Created on 2021-08-05 by the reprex package (v2.0.0)
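To see what the custom parameter function above actually hands to the tuner, it can help to build a small grid from it directly. This is a minimal sketch that reuses the `lambda()` definition from the reprex; the choice of `grid_regular()` with three levels is only for illustration and is not what `tune_grid()` does by default.

```r
library(tidymodels)

# Same parameter function as in the reprex above
lambda <- function(range = c(-10, 0), trans = log10_trans()) {
   new_quant_param(
      type = "double",
      range = range,
      inclusive = c(TRUE, TRUE),
      trans = trans,
      label = c(lambda = "Amount of Regularization"),
      finalize = NULL
   )
}

# The range is given in log10 units, so candidate values are spaced evenly
# on the log scale and returned on the original scale.
lambda_grid <- grid_regular(lambda(), levels = 3)
lambda_grid
```

Because the range is defined on the transformed scale, widening it (for example `lambda(range = c(-10, 2))`) is how you would let the search reach stronger regularization such as the `lambda = 100` from the original question.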

afewmoments commented 3 years ago

@juliasilge thank you for this solution! I would appreciate it if you moved my request to dials, because it seems to me that, as xgboost grows in popularity, more people will want to tune these engine-specific parameters.

juliasilge commented 3 years ago

Next step is to add the engine-specific lambda and alpha here. This might be a good opportunity for a first-time contributor.

joeycouse commented 3 years ago

Working on a PR for these new xgboost-specific tuning parameters, but creating the alpha() parameter masks scales::alpha(), which I'm assuming we'd like to avoid. Is there an SOP for this situation?

Any thoughts on using the existing code for penalty() and mixture() and having some conversion that maps to the corresponding xgboost-specific parameters lambda and alpha? I'd imagine that'd be pretty complex, and it wouldn't match the true xgboost params here.
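For reference, here is a rough sketch of what such a conversion might look like. It assumes the glmnet-style reading of `penalty` and `mixture` (total penalty and the proportion of it that is L1) and assumes xgboost's `alpha` and `lambda` act as L1 and L2 weights on the leaf scores. The helper `to_xgb_penalties()` is hypothetical and not part of dials, parsnip, or xgboost.

```r
# Hypothetical helper (not an existing dials/parsnip function): converts a
# glmnet-style (penalty, mixture) pair into xgboost-style L1/L2 penalties.
to_xgb_penalties <- function(penalty, mixture) {
  stopifnot(penalty >= 0, mixture >= 0, mixture <= 1)
  list(
    alpha  = penalty * mixture,        # L1 share of the total penalty
    lambda = penalty * (1 - mixture)   # L2 share of the total penalty
  )
}

xgb_pen <- to_xgb_penalties(penalty = 0.1, mixture = 0.25)
xgb_pen
```

Even if the arithmetic is this simple, the two values end up coupled through `penalty`, so a grid over (`penalty`, `mixture`) does not cover the same region as a grid over `alpha` and `lambda` directly, and users would no longer be setting the parameters that the xgboost documentation describes.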

This is more of a parsnip complaint, but I think the documentation could use some clarification on what is really an engine-specific tuning parameter:

For instance, the boost_tree() docs list stop_iter as (specific engines only). It is only tunable with the xgboost engine, but it can be tuned within boost_tree(stop_iter = tune())

Shouldn't this parameter get the same treatment as scale_pos_weight() and have to be tuned within set_engine('xgboost', stop_iter = tune()) since it is only tunable with the xgboost engine?

Interested in your thoughts!

juliasilge commented 3 years ago

As far as what is a main argument vs. an engine-specific argument, here is how we lay it out in TMwR:

Modeling functions in parsnip separate model arguments into two categories:

Main arguments are more commonly used and tend to be available across engines.

Engine arguments are either specific to a particular engine or used more rarely.

It is absolutely somewhat subjective and a judgment call.

I think the idea with stop_iter is that, although I am only aware of xgboost having it, it is used a lot by xgboost users so it is more helpful for us to fully support it as a main argument. I don't think the same argument can be made for scale_pos_weight. These are definitely subjective calls, though.
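The practical difference shows up in where each argument goes in a model specification. The sketch below is only an illustration of that split, using the two arguments discussed above; the `classification` mode is an arbitrary choice.

```r
library(tidymodels)

spec <-
  boost_tree(mode = "classification", stop_iter = tune()) %>%  # main argument
  set_engine("xgboost", scale_pos_weight = tune())              # engine argument

names(spec$args)      # main arguments of boost_tree(), including stop_iter
names(spec$eng_args)  # arguments passed through set_engine()
```

Both are marked with `tune()` in the same way; what changes is whether parsnip standardizes the argument name across engines or passes it straight through to the engine.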

Thank you for being willing to contribute a PR to add these engine-specific parameters! 🙌 @hfrick do you have an opinion on how to name alpha and lambda to reduce confusion/masking? I know I use alpha a lot as an argument name in ggplot2 but I don't know that I use the function that much.

hfrick commented 3 years ago

I think penalty_L1() and penalty_L2() would be good! This would be nice to include in the next dials release, which we are aiming to publish next week. @joeycouse would that timeline work for you, or would you prefer that we add it ourselves?

joeycouse commented 3 years ago

I'll try my best! I just opened a PR but I think some edits will need to be done on the parsnip side.

hfrick commented 3 years ago

Thank you @joeycouse !

hfrick commented 3 years ago

Closed in #179
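For anyone landing here later: assuming your installed dials includes the `penalty_L1()` and `penalty_L2()` functions named above, they can be used like any other dials parameter. This is a minimal sketch; the two-level regular grid is only for illustration.

```r
library(tidymodels)

# Parameter objects for xgboost's alpha (L1) and lambda (L2), under the
# names proposed in this thread
penalty_L1()
penalty_L2()

# A small regular grid over both penalties
pen_grid <- grid_regular(penalty_L1(), penalty_L2(), levels = 2)
pen_grid
```

If your parsnip version maps the xgboost engine arguments `alpha` and `lambda` to these functions (an assumption worth checking against your installed version), a spec with `set_engine("xgboost", alpha = tune(), lambda = tune())` should no longer need a hand-written parameter function passed through `param_info`.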

github-actions[bot] commented 3 years ago

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

tidymodels / dials

tune() `alpha` and `lambda` hyperparameters in XGBoost #176