H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
On H2O 3.30.0.1, if a binary model is saved with a validation dataset, the actual value of the validation dataset key is not stored (the same is true for the training dataset). The expectation, though, is that you could retrieve the validation frame key from your binary model even when starting a fresh cluster.
To Reproduce:
Build and save a binary model
{code}
import h2o
from h2o.estimators.gbm import H2OGradientBoostingEstimator
h2o.init()
# import the covtype dataset:
# this dataset is used to classify the correct forest cover type
# the original dataset can be found at https://archive.ics.uci.edu/ml/datasets/Covertype
covtype = h2o.import_file("https://s3.amazonaws.com/h2o-public-test-data/smalldata/covtype/covtype.20k.data")
# convert the response column to a factor
covtype[54] = covtype[54].asfactor()
# set the predictor names and the response column name
predictors = covtype.columns[0:54]
response = 'C55'
# split into train and validation sets
train, valid = covtype.split_frame(ratios = [.8], seed = 1234)
# try using the balance_classes parameter (set to True):
model = H2OGradientBoostingEstimator(balance_classes=True, seed=1234)
model.train(x=predictors, y=response, training_frame=train, validation_frame=valid)

# save the binary model and print its path
binary_path = h2o.save_model(model=model)
print(binary_path)
{code}
Either of these calls will return None for the validation frame (and likewise for the training frame):
{code}
model.actual_params["validation_frame"]
model._model_json["parameters"][2]['actual_value']
{code}
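Until the frame keys are persisted with the binary model, one possible workaround is to record them in a sidecar file written next to the saved model. The helper names below (save_frame_keys, load_frame_keys) are hypothetical, not part of the h2o API; this is a minimal sketch:

{code}
import json

def save_frame_keys(model_path, training_key, validation_key):
    # record the frame keys in a sidecar JSON file next to the saved model
    sidecar = model_path + ".frames.json"
    with open(sidecar, "w") as f:
        json.dump({"training_frame": training_key,
                   "validation_frame": validation_key}, f)
    return sidecar

def load_frame_keys(model_path):
    # read the frame keys back, e.g. after restarting the cluster
    with open(model_path + ".frames.json") as f:
        return json.load(f)
{code}

After h2o.save_model, one could call save_frame_keys(binary_path, train.frame_id, valid.frame_id) and recover the keys later with load_frame_keys(binary_path), even though the model itself reports None.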