h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Include the 'weight' of each tree in the model object of h2o xgboost #15885

Open flippercy opened 1 year ago

flippercy commented 1 year ago

In the POJO of an h2o xgboost model, I noticed that there is a coefficient acting as the 'weight' of each tree. For example, the POJO of a model looks like this:

```java
class XGBoost_model_R_160218545645_1_Tree_g_0_t_0 { static float score0(double[] data) { return 0.3106898f * ((Double.isNaN(data[141]) || ((float)data[141]) >= 540.0f) .....

class XGBoost_model_R_160218545645_1_Tree_g_0_t_1 { static float score0(double[] data) { return 0.31716052f * ((Double.isNaN(data[141]) || ((float)data[141]) >= 553.0f) ....
```

However, in the model object itself, I cannot find the "coefficients", 0.3106898 and 0.31716052, anywhere.

Could you add these coefficients to the model object? Or, if they are already there, could you show me how to find them?

Thank you.
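For context on where a per-tree multiplier like this can come from: in gradient boosting, each tree's raw leaf score is typically shrunk by the learning rate (eta, default 0.3 in XGBoost) before being added to the ensemble's margin, and a generated scorer may fold that shrinkage into a per-tree constant. Below is a minimal sketch of that scoring pattern; the leaf scores, eta, and base score are made-up illustrative numbers, not values from the model above:

```python
import math

# Hypothetical raw leaf scores produced by three boosted trees for one row
# (illustrative values only, not taken from the model in this issue).
tree_scores = [0.8, -0.5, 0.3]

# Each tree's contribution is shrunk by the learning rate before summing;
# a generated scorer can bake this factor into a per-tree multiplier.
eta = 0.3
margin = sum(eta * s for s in tree_scores)  # 0.18

# For binary classification, the margin plus a base score is passed
# through the logistic function to produce a probability.
base_score = 0.5
prob = 1.0 / (1.0 + math.exp(-(margin + base_score)))
```

This is only a sketch of generic shrinkage-scaled scoring; whether the specific constants in the POJO are exactly eta or include additional calibration is the open question of this issue.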

flippercy commented 1 year ago

@mn-mikke @hannah-tillman @ledell:

In addition, I am wondering why h2o builds xgboost models this way. Shouldn't all the trees in an xgboost model contribute equally, with no per-tree weight?

mn-mikke commented 1 year ago

cc @valenad1 @wendycwong

flippercy commented 1 year ago

@mn-mikke @valenad1 @wendycwong Any updates, team?