prefit xgboost model in `xrf_fit()`, allow early stopping

simonpcouch commented 2 years ago

This PR:

* Adds support for early stopping with `xrf::xrf()`
* Corrects routing of `mtry` from `colsample_bytree` to `colsample_bynode` (making it consistent with parsnip, see tidymodels/parsnip#495)
* De-duplicates xgboost training code between parsnip and rules
* Extends testing for the xrf engine and removes `mtry` tests that are now redundant with parsnip

It does so by making use of the `prefit_xgb` argument to `xrf::xrf()`: instead of having xrf handle the xgboost training, we use `parsnip::xgb_train()` and pass the pre-fit xgboost model to xrf.
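
To illustrate the `mtry` routing change listed above, here is a rough sketch in base R. `mtry_to_prop()` is a hypothetical helper written for this example, not a function from parsnip or rules, and the predictor count is made up. In xgboost, `colsample_bytree` draws one column subset per tree, while `colsample_bynode` draws a fresh subset at every split, which is closer to what `mtry` usually means.

```r
# Hypothetical helper (not part of parsnip or rules): turn an `mtry` count
# into the column-sampling proportion that xgboost expects.
mtry_to_prop <- function(mtry, n_predictors) {
  stopifnot(mtry >= 1, n_predictors >= 1)
  min(mtry / n_predictors, 1)
}

n_predictors <- 10  # assumed number of predictors, for illustration only
prop <- mtry_to_prop(mtry = 3, n_predictors = n_predictors)

# Old routing: one column subset is sampled per tree
params_before <- list(colsample_bytree = prop)

# New routing: a column subset is sampled at every split
params_after <- list(colsample_bynode = prop)
```

Both lists carry the same proportion; only the xgboost parameter that receives it differs.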

Justification for pre-fitting

The pre-fitting makes some parts of wrapping easier for us, but was initially needed for early stopping: `xrf::xrf.formula()` wraps both `xgboost::xgb.train()` and `glmnet::glmnet()`. `xrf::xrf.formula()` takes an `xgb_control` argument, and everything supplied in `xgb_control` is passed to the `params` argument of `xgboost::xgb.train()`. This is currently the only way to pass arguments to `xgboost::xgb.train()` through `xrf::xrf.formula()`. This is an issue for early stopping, as `early_stopping_rounds` is a _main_ argument to `xgboost::xgb.train()` and cannot be passed through `params`. Thus, if we wanted to implement an interface for early stopping, we'd either:

* need to contribute to the xrf package to allow for passing arguments to `xrf::xrf.formula()` as main arguments to `xgboost::xgb.train()`, or
* "prefit" the xgboost model with our own machinery and pass it as `prefit_xgb` to `xrf::xrf.formula()`.
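
The second option can be sketched roughly as follows. This is a minimal illustration rather than the code in this PR: it calls `xgboost::xgb.train()` directly instead of `parsnip::xgb_train()`, the data are simulated, and the column names, holdout size, and tuning values are all assumptions. It also assumes the xgboost 1.x argument name `watchlist` (newer xgboost releases may use a different name for this argument).

```r
library(xgboost)
library(xrf)

set.seed(1)
n <- 500
# Simulated data, for illustration only
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- dat$x1 + 2 * (dat$x2 > 0) + rnorm(n, sd = 0.5)

# Hold out some rows so xgboost has a set to monitor for early stopping
val_idx <- sample(n, 100)
x <- as.matrix(dat[, c("x1", "x2", "x3")])
dtrain <- xgb.DMatrix(x[-val_idx, ], label = dat$y[-val_idx])
dval <- xgb.DMatrix(x[val_idx, ], label = dat$y[val_idx])

# Step 1: fit xgboost ourselves. `early_stopping_rounds` is a main argument
# to xgb.train(), not an element of `params`.
xgb_fit <- xgb.train(
  params = list(objective = "reg:squarederror", max_depth = 3, eta = 0.1),
  data = dtrain,
  nrounds = 200,
  watchlist = list(validation = dval),
  early_stopping_rounds = 5,
  verbose = 0
)

# Step 2: hand the pre-fit booster to xrf, which extracts rules from it and
# fits the glmnet model. The feature names in the booster need to match the
# predictor columns xrf builds from the formula.
rf_fit <- xrf(
  y ~ x1 + x2 + x3,
  data = dat[-val_idx, ],
  family = "gaussian",
  prefit_xgb = xgb_fit
)
```

With this arrangement, any main argument of `xgboost::xgb.train()` is available without changes to xrf itself.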

Related to tidymodels/parsnip#749!

simonpcouch commented 2 years ago

Ah, figured it out 🙈 Wrapping in `suppressMessages()` until https://github.com/holub008/xrf/pull/21 is merged and on CRAN.

tidymodels / rules

prefit xgboost model in `xrf_fit()`, allow early stopping #60