h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

The shap package in Python can't explain an H2O stacked ensemble model #15863

Open feihongloveworld opened 8 months ago

feihongloveworld commented 8 months ago

[screenshot]

mn-mikke commented 8 months ago

Hi @feihongloveworld, what algorithm did you use to train your model? See the error message:

Predict feature contributions - SHAP values on an H2O Model (only GBM, XGBoost, DRF models and equivalent imported MOJOs)

cc @tomasfryda

feihongloveworld commented 8 months ago

The best-of-family stacked ensemble from AutoML.


feihongloveworld commented 8 months ago

But SHAP values are a model-agnostic metric.

mn-mikke commented 8 months ago

The latest version, 3.44.0.1, should support SHAP values on stacked ensemble models. @tomasfryda can tell you more about the details and whether there are any limitations.

feihongloveworld commented 8 months ago

@tomasfryda, can you tell me how to get SHAP values for an AutoML stacked ensemble model on test data? I really need this right now and will be waiting online for your reply.

tomasfryda commented 8 months ago

@feihongloveworld Sorry for the delayed response, I was on vacation.

You need to provide the background_frame parameter to calculate SHAP values for any algorithm that is not tree-based.

For example:

# Plot SHAP contributions for one instance (e.g., row 5):
model.shap_explain_row_plot(prostate_test,
                            row_index=5,
                            background_frame=prostate_train[prostate_train["AGE"] > 70, :])

See the Marginal SHAP documentation for more details.
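
Putting the replies together for the original question (SHAP values for the AutoML stacked ensemble on test data), here is a minimal sketch, not an official recipe. It assumes H2O 3.44.0.1 or later, that `predict_contributions` accepts `background_frame` in the same way `shap_explain_row_plot` does above, and that `get_best_model(algorithm="stackedensemble")` returns the best stacked ensemble of the run. The helper name `stacked_ensemble_contributions`, the `demo` function, the dataset URL, and the AutoML settings are illustrative choices that do not come from this thread.

```python
def stacked_ensemble_contributions(aml, test_frame, background_frame):
    """Return per-row SHAP contributions for the stacked ensemble of an AutoML run.

    Hypothetical helper: `aml` is expected to be a trained H2OAutoML object,
    `test_frame` and `background_frame` are expected to be H2OFrames.
    """
    # Ask AutoML for its best stacked ensemble rather than the overall leader,
    # since the leader is not guaranteed to be a stacked ensemble.
    model = aml.get_best_model(algorithm="stackedensemble")
    if model is None:
        raise ValueError("This AutoML run did not produce a stacked ensemble.")
    # Non-tree models need reference rows to compare against, so the
    # background frame is passed explicitly.
    return model.predict_contributions(test_frame, background_frame=background_frame)


def demo():
    """Example run against a live H2O cluster (requires the h2o package and Java)."""
    import h2o
    from h2o.automl import H2OAutoML

    h2o.init()
    # Assumption: the public prostate dataset used in the H2O documentation.
    prostate = h2o.import_file(
        "https://s3.amazonaws.com/h2o-public-test-data/smalldata/prostate/prostate.csv"
    )
    prostate["CAPSULE"] = prostate["CAPSULE"].asfactor()  # binary classification target
    train, test = prostate.split_frame(ratios=[0.8], seed=1)
    features = [c for c in prostate.columns if c not in ("ID", "CAPSULE")]

    aml = H2OAutoML(max_models=5, seed=1)  # small run, for illustration only
    aml.train(x=features, y="CAPSULE", training_frame=train)

    # Use the training data as the background (reference) rows.
    contributions = stacked_ensemble_contributions(aml, test, train)
    print(contributions.head())
```

Calling `demo()` trains a small AutoML run and prints the first rows of the contributions frame. The background frame does not have to be the full training set; a filtered subset such as `prostate_train[prostate_train["AGE"] > 70, :]` from the example above works the same way, and a smaller background frame should keep the computation cheaper.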