Claim 3.2 Reduces explanation (SHAP) variance (Fig. 6)

In models with many features, post-hoc interpretations such as SHAP values (Lundberg & Lee, 2017) can be used to understand how the model makes its predictions. Fig. 6 shows that HS improves the stability of SHAP values with respect to resampling of the dataset. In this experiment, we randomly hold out 50 samples from the breast-cancer dataset. Then, in each of 100 iterations, we randomly select two-thirds of the remaining samples and train an RF on this reduced dataset. For each held-out sample, we measure the variance of its SHAP value for each feature across the 100 iterations. We then average this per-feature variance across all 50 held-out samples and plot the result in Fig. 6 for RF with and without HS. The variances of the SHAP values for RF with HS are substantially smaller than those for RF without HS. Moreover, these improvements in stability persist even for datasets such as Heart, Diabetes, and Ionosphere, for which HS does not greatly improve prediction performance (see the SHAP stability plots for all datasets in Fig. S13 and Fig. S14). When SHAP values are more stable, we can have more faith that they reflect true rather than spurious patterns in the data-generating process.
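The resampling protocol described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the `fit_and_explain` callback, and the use of population variance are assumptions, and `toy_explainer` is a hypothetical stand-in (per-feature univariate slopes) used only so the sketch runs with the standard library. In a real reproduction, the callback would instead train an RF (with or without HS) on the training subset and return SHAP values for the held-out samples, for example from a tree explainer in the `shap` package.

```python
import random
from statistics import fmean, pvariance


def shap_variance_per_feature(X, y, fit_and_explain, n_heldout=50,
                              n_iter=100, train_frac=2 / 3, seed=0):
    """Return one stability score per feature: the variance of each held-out
    sample's attribution across resampled fits, averaged over held-out samples.
    """
    rng = random.Random(seed)
    indices = list(range(len(X)))
    rng.shuffle(indices)
    # Hold out a fixed set of samples; they are never used for training.
    heldout, pool = indices[:n_heldout], indices[n_heldout:]
    X_heldout = [X[i] for i in heldout]
    n_train = round(train_frac * len(pool))

    runs = []  # runs[t][s][j]: attribution of feature j, sample s, iteration t
    for _ in range(n_iter):
        # Draw a fresh training subset (without replacement) each iteration.
        train = rng.sample(pool, n_train)
        runs.append(fit_and_explain([X[i] for i in train],
                                    [y[i] for i in train],
                                    X_heldout))

    n_features = len(X[0])
    # Assumption: population variance; the source does not say which
    # variance estimator was used.
    return [
        fmean(pvariance([run[s][j] for run in runs])
              for s in range(len(X_heldout)))
        for j in range(n_features)
    ]


def toy_explainer(X_train, y_train, X_heldout):
    """Hypothetical stand-in for 'fit a model, compute SHAP values'.

    Attribution of feature j for sample x is slope_j * (x_j - mean_j), where
    slope_j comes from a univariate least-squares fit. This is NOT SHAP.
    """
    n_features = len(X_train[0])
    y_mean = fmean(y_train)
    out = [[0.0] * n_features for _ in X_heldout]
    for j in range(n_features):
        col = [row[j] for row in X_train]
        x_mean = fmean(col)
        var = fmean((v - x_mean) ** 2 for v in col)
        cov = fmean((v - x_mean) * (t - y_mean)
                    for v, t in zip(col, y_train))
        slope = cov / var if var > 0 else 0.0
        for s, row in enumerate(X_heldout):
            out[s][j] = slope * (row[j] - x_mean)
    return out
```

Running this once with a plain-RF callback and once with an RF-plus-HS callback on the same data and seed would give the two sets of per-feature values that Fig. 6 compares. Keeping the held-out set fixed across iterations is what makes the variance attributable to training-set resampling alone.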

