h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Add Shapley value based feature importance to Flow #8886

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

Calculate SHAP values in Flow for the GBM and XGBoost algorithms. Use the mean SHAP value of each feature, averaged over all rows, to calculate and display feature importance, similar to what is done in cell 9 of the following notebook:

https://github.com/h2oai/h2o-3/blob/master/h2o-py/demos/shap_values_gbm.ipynb
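The aggregation described above could be sketched roughly as follows. This is only an illustration, not Flow's or the notebook's actual code. It assumes the per-row contributions are already available as a list of dictionaries (in the H2O Python client they could come from `model.predict_contributions(frame)`, which returns one column per feature plus a `BiasTerm` column). It also assumes the intended metric is the mean absolute SHAP value, the usual convention for SHAP bar plots; the issue itself only says "average". The function name `mean_abs_shap_importance` and the sample values are hypothetical.

```python
from typing import Dict, List, Tuple


def mean_abs_shap_importance(
    rows: List[Dict[str, float]],
    bias_column: str = "BiasTerm",
) -> List[Tuple[str, float]]:
    """Rank features by mean absolute SHAP contribution (assumed metric)."""
    if not rows:
        return []
    # Every column except the bias term is treated as a feature contribution.
    features = [name for name in rows[0] if name != bias_column]
    totals = {name: 0.0 for name in features}
    for row in rows:
        for name in features:
            totals[name] += abs(row[name])
    n = len(rows)
    importance = [(name, totals[name] / n) for name in features]
    # Most important feature first, as in a typical importance bar chart.
    importance.sort(key=lambda item: item[1], reverse=True)
    return importance


# Hypothetical contributions for two rows and two features.
contributions = [
    {"AGE": 0.5, "PSA": -1.0, "BiasTerm": 0.25},
    {"AGE": -0.5, "PSA": 3.0, "BiasTerm": 0.25},
]
ranking = mean_abs_shap_importance(contributions)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

Taking absolute values before averaging keeps positive and negative contributions from cancelling out; with a plain signed mean, `AGE` in the sample above would score 0 even though it moves every prediction.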

To display the SHAP-based feature importance, perhaps add a separate tab alongside the current feature importance graph.

h2o-ops commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-6747
Assignee: New H2O Bugs
Reporter: Bojan Tunguz
State: Open
Fix Version: N/A
Attachments: Available (Count: 2)
Development PRs: N/A

Attachments From Jira

Attachment Name: download.png Attached By: Bojan Tunguz File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-6747/download.png

Attachment Name: Flow_VariableImportances.png Attached By: Bojan Tunguz File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-6747/Flow_VariableImportances.png