h2oai / h2o-3


AutoML/Random Grid Search: weight for (categorical only?) hyperparameters #8348

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

from {{XGBoostSteps}}:

{noformat}
searchParams.put("_booster", new XGBoostParameters.Booster[]{ //gblinear crashes currently
        XGBoostParameters.Booster.gbtree, //default, let's use it more often
        XGBoostParameters.Booster.gbtree,
        XGBoostParameters.Booster.dart
});
{noformat}

However, random search keeps track of index combinations, not parameter-value combinations, so with the config above it can build the same model twice: the two {{gbtree}} entries sit at different indices but resolve to the same value. I guess the harm is small today, as the hyperspace is large, but we need to reconsider this: I wish there were a way to give weights to some params, or maybe have a separate grid…
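To make the duplicate concrete, here is a minimal sketch (not H2O's actual grid-walker code) that counts how many combinations look distinct when they are keyed by index and when they are keyed by resolved value. The class `DedupSketch`, the string stand-ins for the booster enum, and the second hyperparameter `MAX_DEPTHS` are all hypothetical and only serve the illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a grid walker; NOT H2O's implementation.
class DedupSketch {
    // Booster candidates as in the issue: gbtree is listed twice on purpose.
    static final String[] BOOSTERS = {"gbtree", "gbtree", "dart"};
    // Assumed second hyperparameter, added only to form combinations.
    static final int[] MAX_DEPTHS = {3, 6};

    /** Number of combinations seen as distinct when keyed by array indices. */
    static int distinctByIndex() {
        Set<List<Integer>> seen = new HashSet<>();
        for (int b = 0; b < BOOSTERS.length; b++) {
            for (int d = 0; d < MAX_DEPTHS.length; d++) {
                seen.add(Arrays.asList(b, d));
            }
        }
        return seen.size();
    }

    /** Number of combinations seen as distinct when keyed by resolved values. */
    static int distinctByValue() {
        Set<List<Object>> seen = new HashSet<>();
        for (String booster : BOOSTERS) {
            for (int depth : MAX_DEPTHS) {
                seen.add(Arrays.<Object>asList(booster, depth));
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct by index: " + distinctByIndex());
        System.out.println("distinct by value: " + distinctByValue());
    }
}
```

Keying the "already visited" set by value would remove the duplicates, but it would also cancel the intended bias towards `gbtree` once the space is close to exhausted, which is why explicit weights look like the cleaner option.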

Need to figure out if it's worth having a full grid with similar search space for each booster though.
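As a lighter alternative to one grid per booster, the weighting idea could look roughly like the sketch below: each candidate value is listed once and carries a sampling weight. This is only an assumption about how such a feature might behave; `WeightedChoiceSketch` and `pickIndex` are hypothetical names and no such API exists in H2O today.

```java
import java.util.Random;

// Hypothetical sketch of weighted sampling for one categorical hyperparameter.
class WeightedChoiceSketch {
    /**
     * Picks an index with probability proportional to its weight.
     * Assumes weights are non-negative and sum to a positive number.
     */
    static int pickIndex(double[] weights, Random rng) {
        double total = 0;
        for (double w : weights) {
            total += w;
        }
        double r = rng.nextDouble() * total; // uniform in [0, total)
        for (int i = 0; i < weights.length; i++) {
            r -= weights[i];
            if (r < 0) {
                return i;
            }
        }
        return weights.length - 1; // guards against floating-point round-off
    }

    public static void main(String[] args) {
        String[] boosters = {"gbtree", "dart"}; // each value listed once
        double[] weights = {2.0, 1.0};          // gbtree twice as likely as dart
        Random rng = new Random(42);
        int[] counts = new int[boosters.length];
        for (int i = 0; i < 30_000; i++) {
            counts[pickIndex(weights, rng)]++;
        }
        for (int i = 0; i < boosters.length; i++) {
            System.out.println(boosters[i] + ": " + counts[i]);
        }
    }
}
```

With unique values in the array, index combinations and parameter combinations coincide again, so the existing index-based bookkeeping would no longer produce duplicate models.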

h2o-ops commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-7289
Assignee: Sebastien Poirier
Reporter: Sebastien Poirier
State: Open
Fix Version: N/A
Attachments: N/A
Development PRs: N/A