stanfordmlgroup / ngboost

Natural Gradient Boosting for Probabilistic Prediction
Apache License 2.0

NGBoost Algorithm problem with shap.TreeExplainer #167

Closed gerileka closed 4 years ago

gerileka commented 4 years ago

""" Trying the new library for probabilistic regression and classification. Minor problems when using the SHAP explainer.

"""

# SHAP plot for loc trees
import shap
shap.initjs()
explainer = shap.TreeExplainer(ngb, X, model_output=0)
shap_values = explainer.shap_values(X)
shap.dependence_plot(0, shap_values, X)

This gives the following error:


Exception                                 Traceback (most recent call last)
in
      3
      4 ## SHAP plot for loc trees
----> 5 explainer = shap.TreeExplainer(ngb, X, model_output=0)
      6 shap_values = explainer.shap_values(X)
      7 shap.dependence_plot(0, shap_values, X)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\shap\explainers\tree.py in __init__(self, model, data, model_output, feature_dependence)
     87         self.feature_dependence = feature_dependence
     88         self.expected_value = None
---> 89         self.model = TreeEnsemble(model, self.data, self.data_missing)
     90
     91         assert feature_dependence in feature_dependence_codes, "Invalid feature_dependence option!"

~\AppData\Local\Continuum\anaconda3\lib\site-packages\shap\explainers\tree.py in __init__(self, model, data, data_missing)
    708                 self.tree_output = "probability"
    709             else:
--> 710                 raise Exception("Model type not yet supported by TreeExplainer: " + str(type(model)))
    711
    712         # build a dense numpy version of all the tree objects

Exception: Model type not yet supported by TreeExplainer:

ryan-wolbeck commented 4 years ago

@gerileka My guess is the base learner you used isn't compatible. Can you share what that learner is?

gerileka commented 4 years ago

@ryan-wolbeck Hi, thank you for your interest. I used the stock example from the user guide website, without changes.

This is my full code

from ngboost import NGBRegressor

from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

X, Y = load_boston(True)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2)

ngb = NGBRegressor().fit(X_train, Y_train)
Y_preds = ngb.predict(X_test)
Y_dists = ngb.pred_dist(X_test)

# Feature importance for loc trees
feature_importance_loc = ngb.feature_importances_[0]

# Feature importance for scale trees
feature_importance_scale = ngb.feature_importances_[1]

import shap
shap.initjs()
explainer = shap.TreeExplainer(ngb, model_output=0)

ryan-wolbeck commented 4 years ago

I spun up a Colab notebook and was able to run it without error.

gerileka commented 4 years ago

@ryan-wolbeck Thank you. I tried it as well, and I think the problem is my version of shap. I installed it about a year ago, so I probably need to update the library.
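
For anyone hitting the same exception, a quick way to compare a failing local environment with a working one (such as the Colab notebook above) is to print the installed package versions. The snippet below is a minimal sketch using only the standard library. The helper name `installed_version` is made up for illustration, and the thread does not state which shap release first accepts NGBoost models, so no minimum version is assumed here.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Print the versions of the two libraries involved in this issue.
for name in ("shap", "ngboost"):
    version = installed_version(name)
    print(f"{name}: {version if version is not None else 'not installed'}")
```

If the local shap version turns out to be older than the one in the working environment, upgrading with `pip install --upgrade shap` and re-running the example is a reasonable next step.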