h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Track model accuracy and speed over time for nightly builds #9991

Closed: exalate-issue-sync[bot] closed this issue 1 year ago

exalate-issue-sync[bot] commented 1 year ago

Track the model accuracy (AUC/logloss/deviance/accuracy/F1) and model building time on a single-node server for every nightly build, for a set of public datasets (Airline, KDDCup, Kaggle: Paribas/Santander/Telstra/Tradeshift/AfricanSoil/Higgs).

Suggest a new Jenkins job for that, plus a SQL database for recording the metrics.

One set of scripts should use default parameters, and another can use more expert settings.

Consult with [~accountid:557058:3bc534f4-c129-4d5f-b8c1-5a69d34942ee] or [~accountid:557058:3402c6e3-c528-4a01-8b6b-85a92dd2a5f8] for Kaggle datasets and tuning parameters.
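
As a rough illustration of the metrics recording described above, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `nightly_metrics`, its columns, and the helpers `open_db` and `record_run` are assumptions made for this example and are not part of H2O. `train_fn` stands in for whatever script trains a model on a dataset and returns its metrics; a real Jenkins job would likely write to a shared SQL server instead of SQLite.

```python
import sqlite3
import time

# Hypothetical schema: one row per (build, dataset, algorithm, settings, metric).
SCHEMA = """
CREATE TABLE IF NOT EXISTS nightly_metrics (
    build_id      TEXT NOT NULL,
    dataset       TEXT NOT NULL,
    algo          TEXT NOT NULL,
    settings      TEXT NOT NULL,   -- 'default' or 'expert'
    metric        TEXT NOT NULL,   -- e.g. 'auc', 'logloss'
    value         REAL NOT NULL,
    train_seconds REAL NOT NULL,
    PRIMARY KEY (build_id, dataset, algo, settings, metric)
)
"""


def open_db(path=":memory:"):
    """Open (or create) the metrics database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_run(conn, build_id, dataset, algo, settings, train_fn):
    """Time train_fn() and store every metric it returns.

    train_fn is a placeholder for the real training script; it is assumed
    to return a dict such as {"auc": 0.73, "logloss": 0.51}.
    """
    start = time.perf_counter()
    metrics = train_fn()
    elapsed = time.perf_counter() - start

    rows = [
        (build_id, dataset, algo, settings, name, float(value), elapsed)
        for name, value in metrics.items()
    ]
    # The connection context manager commits on success, rolls back on error.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO nightly_metrics VALUES (?, ?, ?, ?, ?, ?, ?)",
            rows,
        )
    return elapsed
```

Keying rows on build, dataset, algorithm, and settings keeps the default-parameter and expert-setting runs side by side, and re-running a build overwrites its earlier rows instead of duplicating them.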

exalate-issue-sync[bot] commented 1 year ago

Bill Gallmeister commented: Navdeep, this would be a great project for Nikhil--also very very important. We'd like to have a running graph of "model performance on dataset over time", run as a CI task, with results stored in a database so we know how changes are affecting model performance over time.
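
One way to turn "model performance on dataset over time" into a CI signal is to compare each build's metric with the previous build's. The sketch below is only an illustration: the function name `find_regressions`, the `tolerance` threshold, and the input shape (a list of `(build_id, value)` pairs in build order) are assumptions, not an existing H2O utility.

```python
def find_regressions(history, tolerance=0.01, higher_is_better=True):
    """Return the build-to-build steps where a metric got worse.

    history: list of (build_id, value) pairs, ordered oldest to newest.
    tolerance: how much worse a value may get before it is flagged.
    higher_is_better: True for AUC/accuracy/F1, False for logloss/deviance.
    """
    regressions = []
    for (prev_build, prev), (build, value) in zip(history, history[1:]):
        # Positive 'drop' always means the newer build is worse.
        drop = (prev - value) if higher_is_better else (value - prev)
        if drop > tolerance:
            regressions.append((prev_build, build, prev, value))
    return regressions


# Hypothetical AUC values for three nightly builds.
auc_history = [("build-1", 0.80), ("build-2", 0.805), ("build-3", 0.75)]
for prev_build, build, prev, value in find_regressions(auc_history):
    print(f"AUC regressed from {prev} ({prev_build}) to {value} ({build})")
```

The same check could be applied to build times by passing `higher_is_better=False` and a tolerance expressed in seconds.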

exalate-issue-sync[bot] commented 1 year ago

Navdeep commented: we should have this already in our accuracy test suite: https://github.com/h2oai/h2o-3/tree/master/h2o-test-accuracy

exalate-issue-sync[bot] commented 1 year ago

Navdeep commented: Moved here: https://0xdata.atlassian.net/browse/MLBENCH-1

h2o-ops commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-3068
Assignee: Navdeep
Reporter: Arno Candel
State: Closed
Fix Version: N/A
Attachments: N/A
Development PRs: N/A