Support boosting from the average in scikit-learn

Scikit-learn's gradient boosting algorithm performs "boosting from the average": a simple base estimator is fitted to the distribution of class labels (or to the average label, for regression) and is used as the initial learner in the ensemble. Boosting from the average speeds up convergence. See https://github.com/dmlc/xgboost/issues/4321 and https://lightgbm.readthedocs.io/en/latest/Parameters.html#boost_from_average
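To make the behavior concrete, here is a minimal sketch for the regression case. It assumes the default squared-error loss, where the initial estimator stored in `init_` predicts the mean of the training labels. The synthetic data and variable names are made up for illustration. The final prediction can be rebuilt as the base prediction plus the learning-rate-scaled sum of the tree outputs, so this base prediction is the extra term an importer would need to carry over.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic regression data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] + 5.0 + rng.normal(scale=0.1, size=200)

# init is left at its default, so the model boosts from the average.
model = GradientBoostingRegressor(n_estimators=20, learning_rate=0.1, random_state=0)
model.fit(X, y)

# Base prediction from the initial estimator (the label mean for squared error).
base = model.init_.predict(X)

# Sum of the raw outputs of the individual regression trees.
tree_sum = sum(tree.predict(X) for tree in model.estimators_[:, 0])

# Rebuild the ensemble prediction by hand.
manual = base + model.learning_rate * tree_sum

print(np.allclose(base, y.mean()))
print(np.allclose(manual, model.predict(X)))
```

With `init="zero"` the base term is dropped and the trees alone make up the prediction, which is the only case Treelite accepts today.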

Currently, Treelite throws an error if init="zero" is omitted when building GradientBoostingClassifier / GradientBoostingRegressor objects. We should remove this restriction in order to support boosting from the average.

dmlc / treelite

Support boosting from the average in scikit-learn #446