Closed joeycouse closed 3 years ago
Feature
I've found the scale_pos_weight feature very useful when using xgb.train(). It would be awesome if this parameter could be tuned using the same syntax as mtry() or other boost_tree() tunable parameters. Thanks!
Looks like we don't have other engine-specific tuning parameters set up for the "xgboost" engine yet. Are there others we should consider adding along with this one?

Yes, we can make dials objects for these and set up the other bits in tune to make this more seamless. In the meantime, you can tune it by explicitly specifying the grid:
library(tidyverse)
library(tidymodels)
#> ── Attaching packages ────────────────────────────────────── tidymodels 0.1.1 ──
#> ✓ broom     0.7.0          ✓ recipes   0.1.15.9000
#> ✓ dials     0.0.9.9000     ✓ rsample   0.0.8.9000
#> ✓ infer     0.5.2          ✓ tune      0.1.2.9000
#> ✓ modeldata 0.1.0          ✓ workflows 0.2.1
#> ✓ parsnip   0.1.4.9000     ✓ yardstick 0.0.7.9000
#> ── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
#> x scales::discard() masks purrr::discard()
#> x dplyr::filter() masks stats::filter()
#> x recipes::fixed() masks stringr::fixed()
#> x dplyr::lag() masks stats::lag()
#> x yardstick::spec() masks readr::spec()
#> x recipes::step() masks stats::step()
library(mlbench)
data("PimaIndiansDiabetes")
set.seed(24)
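# Make 'pos' the first factor level so yardstick treats it as the event class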
df <- PimaIndiansDiabetes %>%
  mutate(diabetes = fct_relevel(diabetes, 'pos'))
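# scale_pos_weight isn't a main boost_tree() argument (yet), so pass it as an
# engine argument and flag it with tune()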
xgb_model_1 <-
  boost_tree(trees = 150,
             tree_depth = 3) %>%
  set_engine('xgboost', scale_pos_weight = tune(), eval_metric = 'auc') %>%
  set_mode('classification')
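# Resample over an explicit grid of candidate scale_pos_weight values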
set.seed(1)
xgb_model_1_res <-
  tune_grid(xgb_model_1, diabetes ~ ., resamples = vfold_cv(df),
            grid = tibble(scale_pos_weight = 10^c(-3:-1)))
#>
#> Attaching package: 'rlang'
#> The following objects are masked from 'package:purrr':
#>
#> %@%, as_function, flatten, flatten_chr, flatten_dbl, flatten_int,
#> flatten_lgl, flatten_raw, invoke, list_along, modify, prepend,
#> splice
#>
#> Attaching package: 'vctrs'
#> The following object is masked from 'package:dplyr':
#>
#> data_frame
#> The following object is masked from 'package:tibble':
#>
#> data_frame
#>
#> Attaching package: 'xgboost'
#> The following object is masked from 'package:dplyr':
#>
#> slice
collect_metrics(xgb_model_1_res)
#> # A tibble: 6 x 7
#>   scale_pos_weight .metric  .estimator  mean     n std_err .config
#>              <dbl> <chr>    <chr>      <dbl> <int>   <dbl> <chr>
#> 1            0.001 accuracy binary     0.349    10  0.0137 Preprocessor1_Model1
#> 2            0.001 roc_auc  binary     0.5      10  0      Preprocessor1_Model1
#> 3            0.01  accuracy binary     0.487    10  0.0130 Preprocessor1_Model2
#> 4            0.01  roc_auc  binary     0.814    10  0.0132 Preprocessor1_Model2
#> 5            0.1   accuracy binary     0.698    10  0.0112 Preprocessor1_Model3
#> 6            0.1   roc_auc  binary     0.798    10  0.0181 Preprocessor1_Model3
Created on 2021-01-19 by the reprex package (v0.3.0)
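For completeness, here's a rough sketch (not from the original thread) of how the tuning results above could be carried forward; it assumes the xgb_model_1 and xgb_model_1_res objects from the reprex, and xgb_model_final / xgb_fit are just illustrative names:

# Pick the scale_pos_weight value with the best cross-validated ROC AUC
best_wt <- select_best(xgb_model_1_res, metric = "roc_auc")

# Splice that value back into the model specification
xgb_model_final <- finalize_model(xgb_model_1, best_wt)

# Fit the finalized specification on the full data set
xgb_fit <- fit(xgb_model_final, diabetes ~ ., data = df)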
Thanks for the help! Additionally, support for lambda (L2 regularization on term weights) within boost_tree() would be great.
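For what it's worth, until lambda gets a main argument, the same engine-argument workaround shown above should apply (a sketch under that assumption; xgb_model_2 and its grid are illustrative names, reusing df from the reprex):

xgb_model_2 <-
  boost_tree(trees = 150, tree_depth = 3) %>%
  # lambda is xgboost's L2 penalty on leaf weights, passed through to xgb.train()
  set_engine('xgboost', lambda = tune(), eval_metric = 'auc') %>%
  set_mode('classification')

set.seed(1)
xgb_model_2_res <-
  tune_grid(xgb_model_2, diabetes ~ ., resamples = vfold_cv(df),
            grid = tibble(lambda = 10^c(-2:1)))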
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.