h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Add Stacked Ensemble MOJO limitation to User Guide #8598

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

We should update the [production section](http://docs.h2o.ai/h2o/latest-stable/h2o-docs/productionizing.html) of the user guide to include what functionality is currently available for the Stacked Ensemble MOJO.

Currently the docs state (under the Notes section [here](http://docs.h2o.ai/h2o/latest-stable/h2o-docs/productionizing.html#what-is-a-mojo)):

> MOJOs are supported for AutoML, Deep Learning, DRF, GBM, GLM, GLRM, K-Means, Stacked Ensembles, SVM, Word2vec, and XGBoost models.

But the section doesn't specify what is and isn't available for the MOJO implementations.

exalate-issue-sync[bot] commented 1 year ago

Erin LeDell commented: It might also be good to add a note on the Stacked Ensemble and AutoML user guide pages about the SE MOJO limitation. (Though I think it's going to be fixed soon: there was a PR to make Deep Learning thread-safe, so DL models can now have a MOJO, and hence Stacked Ensembles built with DL base models will have MOJOs that can be imported. Currently you can create them but not load them back in.)
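To make the limitation concrete, here is a minimal sketch (not from the original thread, untested) of the scenario described above: a Stacked Ensemble containing a Deep Learning base model can export a MOJO, but importing that MOJO back into H2O is the step reported to fail on affected releases. The dataset URL, the `/tmp` output directory, and the model parameters are illustrative assumptions, not part of the original report.

```python
import h2o
from h2o.estimators import (H2ODeepLearningEstimator,
                            H2OGradientBoostingEstimator,
                            H2OStackedEnsembleEstimator)

h2o.init()
train = h2o.import_file("http://h2o-public-test-data.s3.amazonaws.com/smalldata/iris/iris_wheader.csv")
x = train.columns[:-1]
y = "class"

# Cross-validated base models (same nfolds and seed so the folds line up).
gbm = H2OGradientBoostingEstimator(nfolds=3, keep_cross_validation_predictions=True, seed=1)
gbm.train(x=x, y=y, training_frame=train)
dl = H2ODeepLearningEstimator(nfolds=3, keep_cross_validation_predictions=True, seed=1)
dl.train(x=x, y=y, training_frame=train)

# Stacked Ensemble with a Deep Learning base model.
se = H2OStackedEnsembleEstimator(base_models=[gbm.model_id, dl.model_id])
se.train(x=x, y=y, training_frame=train)

mojo_path = se.download_mojo(path="/tmp")  # creating the MOJO works
reloaded = h2o.import_mojo(mojo_path)      # loading it back is the part reported to fail
```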

exalate-issue-sync[bot] commented 1 year ago

Ben Epstein commented: @Erin LeDell The documentation states support for SVM, but in my testing the model reports neither `have_mojo` nor `have_pojo`:

```python
import h2o
from h2o.estimators import H2OSupportVectorMachineEstimator

h2o.init()  # assumes a local H2O cluster; not shown in the original snippet
splice = h2o.import_file("http://h2o-public-test-data.s3.amazonaws.com/smalldata/splice/splice.svm")
svm = H2OSupportVectorMachineEstimator(gamma=0.01, rank_ratio=0.1, disable_training_metrics=False)
svm.train(y="C1", training_frame=splice)

print(svm.have_pojo)
print(svm.have_mojo)
print(h2o.__version__)
```

This prints:

```
False
False
3.28.1.2
```

I noticed similarly that with `H2ONaiveBayesEstimator`, POJO is supported but MOJO isn't.
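A hedged sketch of the analogous Naive Bayes check, continuing from the SVM snippet above (not part of the original report; the iris dataset and default parameters are illustrative):

```python
from h2o.estimators import H2ONaiveBayesEstimator

# Illustrative dataset; any frame with a categorical response would do.
iris = h2o.import_file("http://h2o-public-test-data.s3.amazonaws.com/smalldata/iris/iris_wheader.csv")
nb = H2ONaiveBayesEstimator()
nb.train(y="class", training_frame=iris)

# Per the comment above, this is expected to report POJO support but no MOJO support.
print(nb.have_pojo, nb.have_mojo)
```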

I'm looking here: https://github.com/h2oai/h2o-3/blob/master/h2o-py/h2o/estimators/estimator_base.py#L382-L392

```python
if (model_json["algo"] == "glm") and self.HGLM:
    m._have_pojo = False
    m._have_mojo = False
else:
    m._have_pojo = model_json.get('have_pojo', True)
    m._have_mojo = model_json.get('have_mojo', True)
```

But I'm not yet seeing where the `have_mojo` and `have_pojo` attributes are actually set.

Is this a bug in the documentation or in the code implementation? Thank you!
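For anyone tracing this, a hedged sketch of where those values appear to come from: the `estimator_base.py` excerpt above copies them from the model JSON returned by the REST API, so inspecting that payload directly shows what the backend reports for a given algorithm. The snippet assumes the `svm` model from the earlier reproduction and a running cluster; the endpoint and field names mirror the excerpt but have not been verified against every release.

```python
# Fetch the raw model JSON that the Python client parses for the trained SVM model.
raw = h2o.api("GET /3/Models/%s" % svm.model_id)["models"][0]

# The client-side flags are simply copied from these fields, defaulting to True
# when the backend omits them, per the estimator_base.py excerpt above.
print(raw.get("have_pojo"), raw.get("have_mojo"))
```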

exalate-issue-sync[bot] commented 1 year ago

Angela Bartz commented: I created a separate Jira to track this. Please see https://0xdata.atlassian.net/browse/PUBDEV-7429

h2o-ops commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-7041
Assignee: Angela Bartz
Reporter: Lauren DiPerna
State: Open
Fix Version: N/A
Attachments: N/A
Development PRs: N/A