interpretml / interpret

Fit interpretable models. Explain blackbox machine learning.
https://interpret.ml/docs
MIT License

Questions on global and local explanations #409

Closed · naveen-marthala closed this issue 1 year ago

naveen-marthala commented 1 year ago
Here's the code to reproduce:

```python
## imports
import numpy as np
import pandas as pd
from sklearn.metrics import get_scorer
from sklearn import datasets
from interpret.glassbox import ExplainableBoostingClassifier, ExplainableBoostingRegressor
from interpret import show

## getting data
regression_data_X, regression_data_y = datasets.fetch_california_housing(return_X_y=True, as_frame=True)
regression_data_X.drop(columns=['Latitude', 'Longitude'], inplace=True)
binary_data_X, binary_data_y = datasets.fetch_openml(data_id=43898, target_column='class', return_X_y=True, as_frame=True)
multiclass_data1_X, multiclass_data1_y = datasets.load_iris(return_X_y=True, as_frame=True)
multiclass_data2_X, multiclass_data2_y = datasets.fetch_openml(data_id=42, target_column='class', return_X_y=True, as_frame=True)

## training
regression_model = ExplainableBoostingRegressor(max_rounds=1_000, n_jobs=-1, random_state=200)
regression_model.fit(regression_data_X, regression_data_y)
binary_model = ExplainableBoostingClassifier(max_rounds=1_000, n_jobs=-1, random_state=200)
binary_model.fit(binary_data_X, binary_data_y)
multiclass_model1 = ExplainableBoostingClassifier(max_rounds=1_000, n_jobs=-1, random_state=200)
multiclass_model1.fit(multiclass_data1_X, multiclass_data1_y)
multiclass_model2 = ExplainableBoostingClassifier(max_rounds=1_000, n_jobs=-1, random_state=200)
multiclass_model2.fit(multiclass_data2_X, multiclass_data2_y)

## global-explanations
regression_global_expln = regression_model.explain_global()
show(regression_global_expln)

## local-explanations
_local_expln = regression_model.explain_local(X=regression_data_X.loc[70:72,:], y=regression_data_y[70:73])
show(_local_expln)
```

Questions/Confirmations:

On Global Explanations:

  1. The images below are from the summary tab of the global explanations for regression_model and binary_model. [images] The plot titles of both say 'Mean Absolute Score (Weighted)'. How was MAE used for classification? And how do I interpret the numbers that show up on hover?
  2. This is the explanation of the capital_gain variable of binary_model (zoomed): [image] What does the score on the y-axis mean for a continuous variable? How do I interpret the lower, upper, and main confidence intervals, and what does the width of a confidence interval signify? And in general, what does an ECG-like curve, instead of a relatively smooth line, signify in the explanation of a continuous variable?
  3. This is the explanation of the workclass variable of binary_model: [image] What does the score on the y-axis mean for a categorical variable? What do longer main bars (blue) signify, and vice versa? What does it mean when a category has a small main bar but long confidence intervals?
  4. The first image below is the explanation of the petal length (cm) variable of multiclass_model1: [image] and the second is the explanation of the precip variable of multiclass_model2: [image] They show the explanations of a continuous and a categorical variable respectively, and the explanations are split by the classes of the target variable. This split-by-class explanation does not appear for a binary target. Is there a way I could get it?
  5. This is the explanation of an interaction feature: [image] Are the x and y values that show up on hover the values of the columns seen during training, and is z the score?

On Local Explanations:

  1. This is the local explanation for the 70th sample of multiclass_data1_X with multiclass_model1: [image] What are the values that show up on hover over each class? I couldn't find them with predict_and_contrib either. Is there a way to get these numbers instead of having to read them off the visualisation?
  2. Local explanations for a binary classifier show up the same way they do for a regressor: as contributions to the prediction. But local explanations for a multiclass model show up as in the point above. Is there a way to make the local explanations of a binary classifier show up the same way as for multiclass, split by class?
  3. Running:

     ```python
     >>> multiclass_model1.predict_and_contrib(X=multiclass_data1_X.loc[70:70,:], output='probabilities')
     (array([[0.00974369, 0.50058385, 0.48967246]]),
      array([[[-0.36102402,  0.50983644,  0.0062549 ],
              [ 0.42428517,  0.30131501, -0.31829258],
              [-0.81430566,  1.17133876, -0.14659327],
              [-1.32729048, -0.84675579,  1.51467677]]]))
     ```

     In the above output, the first array has the probabilities (after softmax) and the second array has the logits, correct? And why is the second array of shape 4x3 rather than 4x1, given the 4 features in X?

Thank you.

paulbkoch commented 1 year ago

Hi @naveen-marthala -- 

  1. "Mean Absolute Score (Weighted)" isn't quite what you think in this context.  If you're familiar with traditional GAMs, each feature is expressed as a "smooth function".  We use the same concept, but since our functions are not smooth, we just call the values on the graphs "scores", since there doesn't seem to be any other commonly accepted term that fits our model.  For classification, the scores are logits.  Knowing that, "Mean Absolute Score (Weighted)" can be read as "Mean Absolute Logit (Weighted)".  To calculate that number, we look up the logit contribution of each feature for each sample, take the absolute value of the logit, and then average those absolute values across all samples for each feature (see the first sketch after this list).

  2. For classification, the scores on the y-axis are logits.  The error bars are not true confidence intervals.  During construction we create bagged EBM models internally, and then take the average of the bagged models for each bin.  The error bars are the standard deviation of the scores across the bagged models.  They are not perfect, but they do give you a general sense of the uncertainty within regions.  The non-smoothness of the graphs is an area that generally requires some interpretation.  One possibility is that the function itself should not be smooth and your dataset really does include these jumpy artifacts.  Another possibility is that the model was overfit.  In the case of your graphs, I suspect the jumpiness is an actual property of the data, given that the error bars are fairly small in several of the places where the graph makes its jumps.  This would be a good thing to investigate in more detail.  What is happening, for instance, at the jump at 3k?  For example, are missing values in this dataset imputed with a mean value of 3k?  An investigation will probably turn up something in that region, since +2 on a logit scale is a fairly significant value.  From the histogram it looks like data above 4k is fairly rare, but the graph in that region still has fairly small error bars, so I would think those are real effects as well.  It looks like there are regions with +5 in logits there, which is really significant.

  3. Categoricals are again expressed as logits for classification.  And the error bars have the same definition as for continuous features: they are the standard deviation of the scores across the internally bagged models.

  4. We don't currently support converting multiclass models into per-class binary models.  It's something we're looking into.  For now, if you need binary models, you could retrain these as one-vs-rest.

  5. x is age, y is capital_loss, and z is the score, which again is a logit for classification.

  6. These are multi-class logits. 

  7. No, we do not support visualizing a binary classification as multiclass.  If you wanted to do so, you would halve the value shown for binary classification and negate it; that would be the logit for the class at index 0.  Half the value shown on the graph would then be the logit for the class at index 1 (see the second sketch after this list).

  8. The outermost index is for samples, but you only have one sample, so there's only one item at that index level.  The next index (of size 4) is for the features.  To get 3 probabilities for a 3-class problem, you must pass 3 logits to softmax.  Yes, it is possible to reliably zero one of the logits, but scikit-learn estimators make the number of logits equal to the number of classes, so we do too.  To calculate the probabilities from the per-feature logits, you would sum the values in each column to get the per-class logits (3 of them), then take the softmax (see the third sketch after this list).
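A minimal sketch of the computation described in item 1, reusing the predict_and_contrib call shown elsewhere in this thread. It assumes the unweighted case and assumes the contributions for a binary model come back as a 2-D (samples x terms) array:

```python
# Sketch of item 1: "Mean Absolute Score" per term, computed by hand.
# Reuses binary_model and binary_data_X from the reproduction code above.
import numpy as np

_, contribs = binary_model.predict_and_contrib(X=binary_data_X, output='probabilities')

# contribs is assumed to hold one logit contribution per (sample, term),
# where terms cover both single features and interactions. Averaging the
# absolute values over samples gives one summary number per term.
mean_abs_scores = np.abs(contribs).mean(axis=0)
```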
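A sketch of the binary-to-two-class conversion described in item 7; the score value here is hypothetical:

```python
# Sketch of item 7: splitting one binary logit into two per-class logits.
import numpy as np
from scipy.special import softmax, expit

score = 1.2  # hypothetical value read off a binary EBM graph
per_class = np.array([-score / 2.0, score / 2.0])  # [class 0, class 1]

# The softmax of the pair recovers the same probability as sigmoid(score),
# so the two representations are equivalent.
print(softmax(per_class)[1], expit(score))  # both ~0.7685
```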
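And a sketch of item 8, recovering the probabilities from the per-feature logits. One assumption not spelled out above: the model's intercept (assumed to be stored in intercept_, following the scikit-learn convention) has to be added to the summed per-feature logits before the softmax:

```python
# Sketch of item 8: per-feature logits -> per-class logits -> probabilities.
# Reuses multiclass_model1 and multiclass_data1_X from the reproduction code.
import numpy as np
from scipy.special import softmax

probs, contribs = multiclass_model1.predict_and_contrib(
    X=multiclass_data1_X.loc[70:70, :], output='probabilities')

# contribs has shape (n_samples, n_features, n_classes): sum over the
# feature axis, then add the per-class intercept.
logits = contribs.sum(axis=1) + multiclass_model1.intercept_
print(np.allclose(softmax(logits, axis=1), probs))  # expected: True
```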

naveen-marthala commented 1 year ago

Huge thanks for your time and for answering my questions, @paulbkoch.

> These are multi-class logits.

I was under the assumption that the logits for each class only appear right before they are passed to softmax. So how are multiple logits available for the Intercept?

paulbkoch commented 1 year ago

Hi @naveen-marthala -- If you trained an EBM that had zero features, the intercept would essentially be the base rate. The base rate for a 10-class problem needs a minimum of 9 logits (but we use 10) in order to generate the 10 probabilities. For example:

```python
from scipy.special import softmax

intercept = [1.0, 2.0, 3.0]
softmax(intercept)
```

returns: `array([0.09003057, 0.24472847, 0.66524096])`