h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Small Rosettas for Python, scikit-learn, R, Flow, and Sparkling Water for the website #14859

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

Tiny examples in each language for each algo, for the website.

exalate-issue-sync[bot] commented 1 year ago

Neeraja Madabhushi commented: We need a list of missing Rosettas for all clients.

exalate-issue-sync[bot] commented 1 year ago

J commented: More info for when there is time to work on this:

exalate-issue-sync[bot] commented 1 year ago

J commented: Preliminary list of languages:

Preliminary list of functions for Rosettas:

[~accountid:557058:3ae3c86a-e56a-4211-99d4-9a8cf5ab63f6], can you please review and update as needed? I'm sure there are other functions that should be included here, but I thought this would be a good starting point. Thanks!

exalate-issue-sync[bot] commented 1 year ago

Raymond Peck commented: The overarching goal is for this to be:

It should also link to the reference docs, where appropriate.

For "build model":

Add grid search.

Add data conversion (R data frame and data.table <-> H2O; Python pandas / NumPy / raw 2D arrays <-> H2O; RDD <-> H2O).

Import Files needs to cover both the single-file case and the directory case.

Split frame should show both a random split using a seeded runif and a split by row slice.

Add row and column slicing by name and index.

Add an example of user row weights.

Add examples for handling of unbalanced datasets both using Balance Classes and using user row weights.

Add examples of using GLRM and PCA to reduce dimensionality to feed into other model builders.
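For the split-frame item in the list above, a Python Rosetta might look roughly like the sketch below. It is only an illustration, not the example that will ship on the website: the helper names (`split_random`, `split_by_row_slice`, `demo`), the 0.8 fraction, the seed value, and the `path` argument are all placeholders chosen here. It assumes the h2o Python client, where an `H2OFrame` offers `runif(seed=...)`, boolean-mask row selection, `nrow`, and `[rows, cols]` slicing.

```python
def split_random(frame, train_fraction=0.8, seed=1234):
    """Random split: one seeded uniform draw per row, then boolean masks."""
    r = frame.runif(seed=seed)          # one uniform value in [0, 1) per row
    train = frame[r < train_fraction]   # rows whose draw falls below the cutoff
    test = frame[r >= train_fraction]   # all remaining rows
    return train, test


def split_by_row_slice(frame, train_fraction=0.8):
    """Deterministic split: the leading rows go to train, the rest to test."""
    cut = int(frame.nrow * train_fraction)
    train = frame[0:cut, :]
    test = frame[cut:frame.nrow, :]
    return train, test


def demo(path):
    """Run both splits on a file; `path` is a placeholder for your own data."""
    # Imported here so the helpers above can be reused without a running cluster.
    import h2o

    h2o.init()
    frame = h2o.import_file(path)
    return split_random(frame), split_by_row_slice(frame)
```

The random variant gives a reproducible split for a fixed seed, with sizes that only approximate the requested fraction. The row-slice variant gives exact sizes but preserves the file's row order, so it is only appropriate when that order carries no signal (or when an ordered holdout is what you want).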

DinukaH2O commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-1898
Assignee: Joby Joy
Reporter: Raymond Peck
State: Open
Fix Version: N/A
Attachments: N/A
Development PRs: N/A