h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

H2OGridSearch.train() should default to all columns, like H2OEstimator.train() and H2OAutoML.train() #12144

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

Our default behavior for all estimators (as well as AutoML) is that if x is not specified, train() assumes x = "all columns in the training_frame other than y". For example:

{code:python}
rf = H2ORandomForestEstimator(model_id="rf", ntrees=200)
rf.train(y=y_col, training_frame=train_hex, validation_frame=valid_hex)
{code}

However, H2OGridSearch explicitly requires x. This should be changed for consistency and usability.

{code:python}
gs1 = H2OGridSearch(
    H2OGeneralizedLinearEstimator(family='binomial', nfolds=2, fold_assignment="modulo", keep_cross_validation_predictions=True),
    hyper_parameters,
    search_criteria=criteria)

gs1.train(y=y_col, training_frame=train_hex, validation_frame=valid_hex)

----> 7 gs1.train(y=y_col, training_frame=train_hex, validation_frame=valid_hex)
      8 auc_glm = gs1.auc(valid=True)

TypeError: train() takes at least 2 arguments (4 given)
{code}
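Until the default is changed, one possible workaround is to build x by hand as "every column other than y", which is the rule described above. The sketch below is only an illustration: `default_predictors` and its `ignored` parameter are hypothetical names, not part of the H2O API, and the column names are made up. It works on a plain list of names so it runs without a cluster; with a real frame the list would presumably come from `train_hex.columns`.

```python
def default_predictors(columns, y, ignored=()):
    """Return every column name except the response `y` and any `ignored` names.

    Hypothetical helper mirroring the "all columns other than y" rule;
    it is not part of H2O.
    """
    if y not in columns:
        raise ValueError("response column %r not found in frame" % (y,))
    excluded = {y} | set(ignored)
    # Preserve the original column order.
    return [c for c in columns if c not in excluded]


# Stand-in for train_hex.columns; these names are invented for illustration.
columns = ["age", "income", "fold_id", "label"]
x_cols = default_predictors(columns, "label", ignored=["fold_id"])
print(x_cols)
```

The result could then be passed explicitly, e.g. `gs1.train(x=x_cols, y=y_col, training_frame=train_hex, validation_frame=valid_hex)`, assuming `y_col` is a column name rather than an index.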

hasithjp commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-5272
Assignee: New H2O Bugs
Reporter: Ben Campbell
State: Open
Fix Version: N/A
Attachments: N/A
Development PRs: N/A