h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0
6.87k stars 2k forks

Setup test environment to make sure GBM reproducibility across same hardware setup using yarn #6517

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

Wendy Wong commented: MK has fixed the reproducibility issue with GBM across different hardware settings here: https://h2oai.atlassian.net/browse/PUBDEV-8425

We need to test it and make sure this parameter actually performs as we expect.

exalate-issue-sync[bot] commented 1 year ago

Wendy Wong commented: 3.35.0.5 or newer

exalate-issue-sync[bot] commented 1 year ago

Wendy Wong commented: Please check with [~accountid:5c355702a217aa69bce55831] on how to set up the environment and what tests to run, if needed.

Please check with [~accountid:5f8e6929461cc40075215ee0] on what tests to run.

exalate-issue-sync[bot] commented 1 year ago

Wendy Wong commented: (see attached image-20230208-222652.png)

exalate-issue-sync[bot] commented 1 year ago

Wendy Wong commented: Okay, I propose running this test to check for reproducibility across different hardware setups:

{noformat}
from __future__ import division
from builtins import range
import sys
import h2o
from h2o.estimators.gbm import H2OGradientBoostingEstimator
from h2o.tree import H2OTree
import tempfile

# helper function to copy into your notebook
def compare_frame_one_column(f1, f2, tol=1e-6):
    temp1 = f1.as_data_frame(use_pandas=False)
    temp2 = f2.as_data_frame(use_pandas=False)
    for rowInd in range(1, f1.nrow):
        v1 = float(temp1[rowInd][0])
        v2 = float(temp2[rowInd][0])
        diff = abs(v1 - v2) / max(1.0, abs(v1), abs(v2))
        assert diff <= tol, "Failed frame values check at row {2} and column {3}! frame1 value: {0}, column name: {4}." \
                            " frame2 value: {1}, column name: {5}".format(temp1[rowInd][0], temp2[rowInd][0],
                                                                          rowInd, 0, f1.names[0], f2.names[0])

# test starts
fr = h2o.import_file("https://h2o-public-test-data.s3.amazonaws.com/bigdata/laptop/covtype/covtype.full.csv")

# build the first model with one hardware configuration
m = H2OGradientBoostingEstimator(seed=1234, score_tree_interval=2)
m.train(x=list(range(0, 12)), y="Cover_Type", training_frame=fr)
pred = m.predict(fr)

# save the prediction result for comparison later; remember to give a
# different name to the runs from the other hardware setups
h2o.download_csv(pred, "/some/directory/pred.csv")

# save the model; it may need to be exported so it can be accessed from the
# other hardware environment
tmpdir = tempfile.mkdtemp()
m_path = m.download_model(tmpdir)

# to compare the predictions from different runs, do this:
pred = h2o.import_file("/path/to/pred.csv")
pred2 = h2o.import_file("/path/to/pred2.csv")
for index in range(1, pred.ncols):
    compare_frame_one_column(pred[index], pred2[index])

# load the model from the previous run:
m2 = h2o.load_model(m2_path)  # make sure m2_path is accessible in the current hardware environment

# compare the tree structures of both models to make sure they are the same
# (code by Adam Valenta; assert_list_equals comes from the h2o pyunit test
# utilities, and ntrees should be set to the ntrees of the trained models)
for ntree in range(ntrees):
    for output_class in ['class_1', 'class_2', 'class_3', 'class_4', 'class_5', 'class_6', 'class_7']:
        tree = H2OTree(model=m, tree_number=ntree, tree_class=output_class)
        tree2 = H2OTree(model=m2, tree_number=ntree, tree_class=output_class)
        assert_list_equals(tree.predictions, tree2.predictions)
        assert_list_equals(tree.thresholds, tree2.thresholds, delta=1e-50)  # need to specify delta to check nans
        assert_list_equals(tree.decision_paths, tree2.decision_paths)
        print("Tree", ntree, "class", output_class, "ok")
{noformat}
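The core of `compare_frame_one_column` above is a relative-tolerance check. A minimal standalone sketch (pure Python, no h2o dependency; the helper names here are hypothetical, not part of the h2o API):

```python
def rel_close(v1, v2, tol=1e-6):
    """True if v1 and v2 agree within a relative tolerance.

    The denominator is floored at 1.0 so values near zero are effectively
    compared with an absolute tolerance instead of blowing up the ratio.
    """
    diff = abs(v1 - v2) / max(1.0, abs(v1), abs(v2))
    return diff <= tol


def columns_match(col1, col2, tol=1e-6):
    """Compare two columns of numeric values element-wise."""
    if len(col1) != len(col2):
        return False
    return all(rel_close(float(a), float(b), tol) for a, b in zip(col1, col2))
```

This is the same comparison the assert in the helper performs, just returning a boolean instead of raising.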
exalate-issue-sync[bot] commented 1 year ago

Wendy Wong commented: If you are checking for reproducibility on the same cluster, you can just do the following:

{noformat}
from __future__ import division
from builtins import range
import sys
import h2o
from h2o.estimators.gbm import H2OGradientBoostingEstimator
from h2o.tree import H2OTree
import tempfile

# helper function to copy into your notebook
def compare_frame_one_column(f1, f2, tol=1e-6):
    temp1 = f1.as_data_frame(use_pandas=False)
    temp2 = f2.as_data_frame(use_pandas=False)
    for rowInd in range(1, f1.nrow):
        v1 = float(temp1[rowInd][0])
        v2 = float(temp2[rowInd][0])
        diff = abs(v1 - v2) / max(1.0, abs(v1), abs(v2))
        assert diff <= tol, "Failed frame values check at row {2} and column {3}! frame1 value: {0}, column name: {4}." \
                            " frame2 value: {1}, column name: {5}".format(temp1[rowInd][0], temp2[rowInd][0],
                                                                          rowInd, 0, f1.names[0], f2.names[0])

fr = h2o.import_file("https://h2o-public-test-data.s3.amazonaws.com/bigdata/laptop/covtype/covtype.full.csv")

# build the first model
m = H2OGradientBoostingEstimator(seed=1234, score_tree_interval=2)
m.train(x=list(range(0, 12)), y="Cover_Type", training_frame=fr)
pred = m.predict(fr)

h2o.download_csv(pred, "/some/directory/pred.csv")

# build the second model with the same hardware configuration
m2 = H2OGradientBoostingEstimator(seed=1234, score_tree_interval=2)
m2.train(x=list(range(0, 12)), y="Cover_Type", training_frame=fr)
pred2 = m2.predict(fr)

for index in range(1, pred.ncols):
    compare_frame_one_column(pred[index], pred2[index])

# compare the tree structures of both models to make sure they are the same
# (code by Adam Valenta; assert_list_equals comes from the h2o pyunit test
# utilities, and ntrees should be set to the ntrees of the trained models)
for ntree in range(ntrees):
    for output_class in ['class_1', 'class_2', 'class_3', 'class_4', 'class_5', 'class_6', 'class_7']:
        tree = H2OTree(model=m, tree_number=ntree, tree_class=output_class)
        tree2 = H2OTree(model=m2, tree_number=ntree, tree_class=output_class)
        assert_list_equals(tree.predictions, tree2.predictions)
        assert_list_equals(tree.thresholds, tree2.thresholds, delta=1e-50)  # need to specify delta to check nans
        assert_list_equals(tree.decision_paths, tree2.decision_paths)
        print("Tree", ntree, "class", output_class, "ok")
{noformat}
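The `assert_list_equals` used in the tree comparison comes from h2o's pyunit test utilities; it is not defined in the snippet itself. A hypothetical NaN-aware sketch of what it needs to do (a plain `==` would reject identical trees because `NaN != NaN` in float comparison, which is why a `delta` must be passed for the thresholds):

```python
import math

def assert_list_equals(l1, l2, delta=0.0):
    """Sketch of a NaN-aware element-wise list comparison (hypothetical
    reimplementation of the h2o pyunit helper of the same name)."""
    assert len(l1) == len(l2), "length mismatch: %d vs %d" % (len(l1), len(l2))
    for i, (a, b) in enumerate(zip(l1, l2)):
        if isinstance(a, float) and isinstance(b, float):
            if math.isnan(a) and math.isnan(b):
                continue  # treat two NaNs (e.g. leaf-node thresholds) as equal
            assert abs(a - b) <= delta, "mismatch at index %d: %r vs %r" % (i, a, b)
        else:
            assert a == b, "mismatch at index %d: %r vs %r" % (i, a, b)
```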
exalate-issue-sync[bot] commented 1 year ago

Wendy Wong commented: I set score_tree_interval=1, 2, or 8 in both tests and they still generate different outputs:

(see attached image-20230208-235411.png)

To clarify: I set both models to the same score_tree_interval value (1, 2, or 8) at the same time.

exalate-issue-sync[bot] commented 1 year ago

Adam Valenta commented: Since there is a known issue with variable importance, I ran tests, and it appears the issue affects only the variable importances, not the GBM training itself, so I would focus only on the prediction output. We can also use the H2OTree API to check the trees.

Here is a PR: https://github.com/h2oai/h2o-3/pull/6491

h2o-ops commented 1 year ago

JIRA Issue Details

Jira Issue: PUBDEV-8979
Assignee: Arun Aryasomayajula
Reporter: Wendy Wong
State: Open
Fix Version: N/A
Attachments: Available (Count: 2)
Development PRs: N/A

h2o-ops commented 1 year ago

Attachments From Jira

Attachment Name: image-20230208-222652.png
Attached By: Wendy Wong
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-8979/image-20230208-222652.png

Attachment Name: image-20230208-235411.png
Attached By: Wendy Wong
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-8979/image-20230208-235411.png

wendycwong commented 11 months ago

@arunaryasomayajula : Any updates on this one?