Hi, first of all, thanks for the great package. I am unsure whether this is a bug or whether I am misunderstanding the colorbar on the summary plot.

I have a CatBoost classifier with a SHAP explainer, set up as shown below:

import catboost as cb
import shap

model = cb.CatBoostClassifier(iterations=100, depth=8, learning_rate=0.1, loss_function='Logloss')
model.fit(X_train, y_train, cat_features=categorical_features_indices, eval_set=(X_test, y_test), plot=False)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(cb.Pool(X_train, y_train, cat_features=categorical_features_indices))

I created a dependence plot for a single variable, passing only that variable's SHAP values and feature column so that the colorbar could only be scaled by that one variable:

var_to_plot = 'TestVar'  # an int64 variable
idx = X_train.columns.get_loc(var_to_plot)
shap_plot = shap_values[:, idx]       # SHAP values for this feature only
Xtrain_plot = X_train[var_to_plot]    # feature values for this feature only
shap.dependence_plot(var_to_plot, shap_plot[:, None], Xtrain_plot.to_frame())

The line below, which uses the full data and sets the interaction index to the same variable, produces a dependence plot identical to the one above:

shap.dependence_plot(var_to_plot, shap_values, X_train, interaction_index="TestVar")

However, when I compare the dependence plot to the summary plot, they disagree. The summary plot suggests that the largest SHAP values, around 0.8, occur for the highest values of TestVar, while the dependence plot shows SHAP values of around 0.8 occurring for the lower values of TestVar. The color scales of the two plots therefore seem inconsistent with each other.
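One way to check which plot reflects the data, independently of either colorbar, would be to look at the raw numbers directly. Below is a minimal sketch of that idea using only NumPy. The helper name `feature_values_at_top_shap` is hypothetical (it is not part of SHAP), and the synthetic arrays are stand-ins for `shap_values[:, idx]` and `X_train[var_to_plot]`; they are made up purely so the snippet runs on its own.

```python
import numpy as np

def feature_values_at_top_shap(shap_col, feature_col, k=50):
    """Return the feature values of the k rows with the largest SHAP values.

    Hypothetical helper for illustration; not part of the SHAP package.
    """
    shap_col = np.asarray(shap_col, dtype=float)
    feature_col = np.asarray(feature_col, dtype=float)
    top = np.argsort(shap_col)[-k:]  # indices of the k largest SHAP values
    return feature_col[top]

# Synthetic stand-ins (assumption): SHAP values that decrease as the feature grows.
# With real data these would be shap_values[:, idx] and X_train[var_to_plot].
rng = np.random.default_rng(0)
feature_col = rng.integers(0, 100, size=1000)
shap_col = 0.8 - 0.01 * feature_col + rng.normal(0, 0.02, size=1000)

top_vals = feature_values_at_top_shap(shap_col, feature_col, k=50)
print("median feature value overall:    ", np.median(feature_col))
print("median feature value at top SHAP:", np.median(top_vals))
```

If the median feature value among the top SHAP rows is well below the overall median, the largest SHAP values belong to low feature values, which would match the dependence plot described above; the opposite result would match the summary plot.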

For reference, I am using SHAP 0.28.5 and matplotlib 2.1.1.


Colorbar feature scale mismatch between dependence plot and summary plot #528
