dmlc / treelite

Universal model exchange and serialization format for decision tree forests
https://treelite.readthedocs.io/en/latest/
Apache License 2.0

Support boosting from the average in scikit-learn #446

Closed hcho3 closed 11 months ago

hcho3 commented 1 year ago

Scikit-learn's gradient boosting algorithm performs "boosting from the average," where a simple base estimator is fitted from the distribution of class labels (or the average label, if regression is used) and is set as the initial learner in the ensemble model. Boosting from the average speeds up convergence. See https://github.com/dmlc/xgboost/issues/4321 and https://lightgbm.readthedocs.io/en/latest/Parameters.html#boost_from_average
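To illustrate the convergence claim, here is a minimal pure-Python sketch of gradient boosting with squared-error loss and one-split stumps. It is not scikit-learn's or Treelite's implementation; `fit_stump`, `boost`, `base_score`, and the toy data are hypothetical names and values chosen for the example. The only difference between the two runs is the initial prediction: zero versus the average label.

```python
from statistics import mean

def fit_stump(xs, residuals):
    """Find the one-split stump (threshold, left_value, right_value)
    that minimizes squared error on the residuals."""
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = mean(left), mean(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    return best[1:]

def boost(xs, ys, base_score, n_rounds=5, learning_rate=0.1):
    """Run a few boosting rounds starting from a constant prediction.
    Returns the fitted stumps and the final training MSE."""
    preds = [base_score] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # For squared-error loss, the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lv, rv = fit_stump(xs, residuals)
        stumps.append((t, lv, rv))
        preds = [p + learning_rate * (lv if x <= t else rv)
                 for x, p in zip(xs, preds)]
    mse = mean((y - p) ** 2 for y, p in zip(ys, preds))
    return stumps, mse

# Toy data: labels sit far from zero, so starting at zero wastes rounds.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [101.0, 102.0, 103.0, 104.0, 105.0, 106.0]

_, mse_zero = boost(xs, ys, base_score=0.0)      # analogous to init="zero"
_, mse_avg = boost(xs, ys, base_score=mean(ys))  # boosting from the average

print(f"MSE starting from zero:    {mse_zero:.3f}")
print(f"MSE starting from average: {mse_avg:.3f}")
```

With the same number of rounds, the run that starts from the average ends with a much lower training error, because the trees only need to model deviations from the mean rather than the mean itself. For a model converter, this also means the constant initial prediction has to be carried alongside the trees for the outputs to match.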

Currently, Treelite throws an error unless init="zero" is set when building GradientBoostingClassifier / GradientBoostingRegressor objects. We should remove this restriction in order to support boosting from the average.