The example described at [https://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/algo-params/calibration_frame.html|https://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/algo-params/calibration_frame.html] works as expected. However, if the model is saved as a MOJO, imported back, and used for prediction, the calibrated probabilities (cal_p0, cal_p1) are missing from the output.
{code:python}
import h2o
from h2o.estimators.gbm import H2OGradientBoostingEstimator
h2o.init()

# Import the ecology dataset
ecology = h2o.import_file("https://s3.amazonaws.com/h2o-public-test-data/smalldata/gbm_test/ecology_model.csv")

# Convert response column to a factor
ecology['Angaus'] = ecology['Angaus'].asfactor()

# Set the predictors and the response column name
response = 'Angaus'
predictors = ecology.columns[3:13]

# Split into train and calibration sets
train, calib = ecology.split_frame(seed = 12354)

# Introduce a weight column (artificial non-constant) ONLY to the train set (NOT the calibration one)
w = h2o.create_frame(binary_fraction=1, binary_ones_fraction=0.5, missing_fraction=0, rows=744, cols=1)
w.set_names(["weight"])
train = train.cbind(w)

# Train an H2O GBM Model with Calibration
ecology_gbm = H2OGradientBoostingEstimator(ntrees = 10, max_depth = 5, min_rows = 10,
                                           learn_rate = 0.1, distribution = "multinomial",
                                           calibrate_model = True, calibration_frame = calib)
ecology_gbm.train(x = predictors, y = "Angaus", training_frame = train, weights_column = "weight")

predicted = ecology_gbm.predict(train)

# View the calibrated predictions appended to the original predictions
predicted
# predict  p0  p1  cal_p0  cal_p1
# [744 rows x 5 columns]

# If we now save the model as a MOJO and repeat the same operation:
my_save_mojo = ecology_gbm.save_mojo("", force=True)
mojo_model = h2o.import_mojo(my_save_mojo)
mojo_predicted = mojo_model.predict(train)

# The calibrated predictions are not appended to the original predictions
mojo_predicted
# predict  p0  p1
# [744 rows x 3 columns]
{code}
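To make the discrepancy easy to check programmatically, the comparison can be sketched as a small helper that works on plain lists of column names. The function name {{missing_calibrated_columns}} is hypothetical and not part of the H2O API, and it assumes that calibrated columns can be recognised by the {{cal_}} prefix seen in the output above.

```python
def missing_calibrated_columns(native_cols, mojo_cols):
    """Return calibrated columns present in the native model's prediction
    output but absent from the MOJO model's prediction output.

    Assumption: calibrated probability columns are identified by the
    "cal_" prefix (cal_p0, cal_p1), as in the output shown in this report.
    """
    mojo_set = set(mojo_cols)
    return [c for c in native_cols if c.startswith("cal_") and c not in mojo_set]


# Column names taken from the two outputs in this report
native_cols = ["predict", "p0", "p1", "cal_p0", "cal_p1"]
mojo_cols = ["predict", "p0", "p1"]
print(missing_calibrated_columns(native_cols, mojo_cols))
```

With a running cluster, the same check could be applied to the frames from the reproduction as {{missing_calibrated_columns(predicted.columns, mojo_predicted.columns)}}; an empty list would mean the MOJO output retains the calibrated probabilities.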