h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Grid Search in Flow doesn't Match Grid Search in R/Python #15430

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

For the [gbmTuning.flow|https://github.com/h2oai/h2o-3/blob/master/h2o-docs/src/product/flow/packs/examples/GBM_TuningGuide.flow] notebook compared to the R [gbmTuning.Rmd|https://github.com/h2oai/h2o-3/blob/master/h2o-docs/src/product/tutorials/gbm/gbmTuning.Rmd] or Python [gbmTuning.ipynb|https://github.com/h2oai/h2o-3/blob/master/h2o-docs/src/product/tutorials/gbm/gbmTuning.ipynb] guides:

When you run a grid search with {code}strategy = RandomDiscrete{code}, Flow does not return the same set of models as Python or R (which match each other). Flow also does not produce the same results across multiple runs, even when a seed is set and H2O is shut down and restarted.
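To make the comparison concrete, here is a minimal Python sketch of how the reproducibility of a seeded RandomDiscrete grid search could be checked from the Python client. It is not taken from the tuning guides: the helper names (`random_discrete_criteria`, `run_grid`, `compare_runs`), the default seed, and the `max_models` value are assumptions for illustration. It covers only the Python side, since Flow has to be run separately and its model list compared by hand.

```python
"""Sketch: check whether a seeded RandomDiscrete grid search is reproducible.

All helper names and default values here are hypothetical, chosen for
illustration; they do not come from the tuning guides.
"""


def random_discrete_criteria(seed, max_models):
    # search_criteria dict in the form H2OGridSearch accepts.
    return {"strategy": "RandomDiscrete", "seed": seed, "max_models": max_models}


def run_grid(train, x, y, hyper_params, criteria, model_seed=1234):
    # h2o is imported lazily so compare_runs() can be used without a cluster.
    from h2o.estimators.gbm import H2OGradientBoostingEstimator
    from h2o.grid.grid_search import H2OGridSearch

    grid = H2OGridSearch(
        H2OGradientBoostingEstimator(seed=model_seed),
        hyper_params=hyper_params,
        search_criteria=criteria,
    )
    grid.train(x=x, y=y, training_frame=train)

    # Record each model as a sorted tuple of (hyperparameter, value) pairs.
    combos = []
    for model_id in grid.model_ids:
        params = grid.get_hyperparams_dict(model_id, display=False)
        combos.append(tuple(sorted((k, str(v)) for k, v in params.items())))
    return combos


def compare_runs(first, second):
    # Set comparison ignores model order and model IDs, which can differ.
    a, b = set(first), set(second)
    return {"shared": a & b, "only_first": a - b, "only_second": b - a}
```

Calling `run_grid` twice with the same criteria (restarting H2O in between, if desired) and passing both results to `compare_runs` shows whether the two runs sampled the same hyperparameter combinations; empty `only_first` and `only_second` sets mean they matched.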

Log files from a fresh run of Flow and Python are attached, as well as screenshots from the max_depth grid search and the final grid search.

original jira ticket: https://0xdata.atlassian.net/browse/PUBDEV-2976

hasithjp commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-3213
Assignee: UNASSIGNED
Reporter: Lauren DiPerna
State: Open
Fix Version: N/A
Attachments: Available (Count: 7)
Development PRs: N/A

Attachments From Jira

Attachment Name: flow_final_grid_search.png Attached By: Lauren DiPerna File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-3213/flow_final_grid_search.png

Attachment Name: flow_logs.zip Attached By: Lauren DiPerna File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-3213/flow_logs.zip

Attachment Name: flow_max_depth_grid.png Attached By: Lauren DiPerna File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-3213/flow_max_depth_grid.png

Attachment Name: python_auc_final_grid_search.png Attached By: Lauren DiPerna File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-3213/python_auc_final_grid_search.png

Attachment Name: python_logs.zip Attached By: Lauren DiPerna File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-3213/python_logs.zip

Attachment Name: python_max_depth_grid.png Attached By: Lauren DiPerna File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-3213/python_max_depth_grid.png

Attachment Name: python_model_ids_final_grid_search.png Attached By: Lauren DiPerna File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-3213/python_model_ids_final_grid_search.png