h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Support for element-wise grid search in H2OGridSearch #8586

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

Hi,

I was wondering if there is a way in H2OGridSearch to search over a pre-defined set of parameter combinations (which I'm calling element-wise) rather than using Cartesian or RandomDiscrete.

This is possible in sklearn, but does not appear to be possible in H2O. Here is a reprex in sklearn that demonstrates the functionality.

{code:python}
# This will search over the Cartesian combination
import pandas as pd
from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
parameters = {'kernel': ('linear', 'rbf'), 'C': [1, 10]}

svc = svm.SVC(gamma="scale")
clf = GridSearchCV(svc, parameters, cv=5, return_train_score=True)
clf.fit(iris.data, iris.target)
pd.DataFrame(data=clf.cv_results_)[['param_C', 'param_kernel']]

#   param_C param_kernel
# 0       1       linear
# 1       1          rbf
# 2      10       linear
# 3      10          rbf

# However, I can also search over user-defined combinations
# by feeding in a list of dictionaries
import pandas as pd
from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
parameters_grid = [{'C': [1], 'kernel': ['linear']}, {'C': [10], 'kernel': ['rbf']}]

svc = svm.SVC(gamma="scale")
clf = GridSearchCV(svc, parameters_grid, cv=5, return_train_score=True)
clf.fit(iris.data, iris.target)
pd.DataFrame(data=clf.cv_results_)[['param_C', 'param_kernel']]

#   param_C param_kernel
# 0       1       linear
# 1      10          rbf
{code}
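To make the difference between the two calls concrete, here is a minimal pure-Python sketch of how a dict versus a list of dicts expands into explicit combinations. The helper name `expand_grid` is made up for illustration; it only mimics the expansion behaviour described above and is not sklearn's or H2O's actual implementation.

```python
from itertools import product

def expand_grid(param_grid):
    """Expand a dict, or a list of dicts, of candidate values into explicit combinations.

    Hypothetical helper: each dict is expanded as its own Cartesian product,
    and the results of all dicts are concatenated.
    """
    grids = [param_grid] if isinstance(param_grid, dict) else list(param_grid)
    combos = []
    for grid in grids:
        keys = sorted(grid)
        for values in product(*(grid[k] for k in keys)):
            combos.append(dict(zip(keys, values)))
    return combos

# A single dict yields the full Cartesian product (4 combinations).
cartesian = expand_grid({'kernel': ('linear', 'rbf'), 'C': [1, 10]})

# A list of single-valued dicts yields only the listed pairs (2 combinations).
element_wise = expand_grid([{'C': [1], 'kernel': ['linear']},
                            {'C': [10], 'kernel': ['rbf']}])

print(len(cartesian))
print(element_wise)
```

Because every dict in the list holds exactly one value per parameter, each dict contributes exactly one combination, which is what makes the search "element-wise".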

Based on the grid search documentation at http://docs.h2o.ai/h2o/latest-stable/h2o-docs/grid-search.html#grid-search-in-python, this type of element-wise grid search does not seem to be possible in H2O.

Using the example in the documentation, consider hyper_params = {'ntrees':[1,100], 'learn_rate':[0.1, 0.001]}. Ideally, I'd like to specify hyper_params and/or search_criteria differently so that I can search over only the following combinations.

||ntrees||learn_rate||
|1|0.1|
|100|0.001|

Currently, I believe one can only search the full Cartesian space, using either the Cartesian or RandomDiscrete strategy, which would cover the four combinations below instead of only the two above.

||ntrees||learn_rate||
|1|0.1|
|1|0.001|
|100|0.1|
|100|0.001|
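As a possible workaround outside of H2OGridSearch, the element-wise pairing could be done by hand and each combination trained separately. The sketch below is only an illustration under stated assumptions: `elementwise_combinations`, `elementwise_search`, and `fake_train` are hypothetical names, and `fake_train` is a stand-in for real model training so that the example runs without an H2O cluster.

```python
def elementwise_combinations(hyper_params):
    """Pair the i-th value of every hyperparameter list (no Cartesian product)."""
    names = list(hyper_params)
    lengths = {len(hyper_params[name]) for name in names}
    if len(lengths) > 1:
        raise ValueError("all hyperparameter lists must have the same length")
    return [dict(zip(names, values)) for values in zip(*hyper_params.values())]

def elementwise_search(train_fn, hyper_params):
    """Call train_fn(**params) once per element-wise combination."""
    return [(params, train_fn(**params))
            for params in elementwise_combinations(hyper_params)]

hyper_params = {'ntrees': [1, 100], 'learn_rate': [0.1, 0.001]}
combos = elementwise_combinations(hyper_params)

# Hypothetical stand-in for training; a real train_fn would build and fit a model.
def fake_train(ntrees, learn_rate):
    return f"model(ntrees={ntrees}, learn_rate={learn_rate})"

results = elementwise_search(fake_train, hyper_params)
for params, model in results:
    print(params, model)
```

With H2O, `train_fn` might construct an estimator such as `H2OGradientBoostingEstimator(ntrees=ntrees, learn_rate=learn_rate)`, train it, and return it. The resulting models would then have to be compared manually, since they would not be collected in a grid object the way H2OGridSearch results are.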

h2o-ops commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-7054
Assignee: Michal Kurka
Reporter: Jason Muhlenkamp
State: Open
Fix Version: N/A
Attachments: N/A
Development PRs: N/A