marcotcr / lime

Lime: Explaining the predictions of any machine learning classifier

Lime Submodular_pick with keras #617

Closed ForgeFin closed 3 years ago

ForgeFin commented 3 years ago

I have a straightforward binary sentiment classification task (class labels 0 and 1). I train a Keras network and run the LimeTextExplainer on a random document (document[idx]). Since Keras models have no predict_proba method, but LimeTextExplainer requires such a function, I created my own:

def predict_prob(string):
    ''' must take a list of d strings and output a (d, k) numpy array
        with prediction probabilities, where k is the number of classes
    '''
    x_temp = count.transform(np.array(string))  ## vectorize the input strings
    prediction = model.predict(convert_sparse_matrix_to_sparse_tensor(x_temp))  ## P(class 1), shape (d, 1)
    class_zero = 1 - prediction
    probability = np.append(class_zero, prediction, axis=1)

    return probability  ## array [1-p, p]
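
For reference, a quick sanity check of this contract (the two example documents here are made up; count and model are the fitted vectorizer and trained network assumed above):

probs = predict_prob(['a great movie', 'a terrible plot'])
print(probs.shape)        ## expected: (2, 2)
print(probs.sum(axis=1))  ## each row should sum to ~1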

(I understand the output of the text explainer as follows: words with negative weights push the prediction toward class 0, and words with positive weights push it toward class 1.)

from lime.lime_text import LimeTextExplainer
from lime import lime_text

explainer = LimeTextExplainer(class_names=['negative','positive'])
ex = explainer.explain_instance(document[idx], predict_prob,
                                num_features=10)

from collections import OrderedDict
import pandas as pd
weights = OrderedDict(ex.as_list())
lime_weights = pd.DataFrame({'words': list(weights.keys()), 'weights': list(weights.values())})

However, what I don't really understand is the output of SubmodularPick. First, it gives me a KeyError for one of the explanations, and second, the output of the one that works is exactly the same as explain_instance. What am I doing wrong here?

from lime import submodular_pick
sp_obj = submodular_pick.SubmodularPick(explainer, document, predict_prob, sample_size=10, num_features=feature_n, num_exps_desired=2)

How do I retrieve the most relevant features (globally)? Is it sp_obj.explanations or sp_obj.sp_explanations? What is the difference?

Running W_matrix = pd.DataFrame([dict(this.as_list()) for this in sp_obj.explanations]) results in an error for me:

  File "C:\Users\Anaconda3\envs\pyGPU\lib\site-packages\lime\explanation.py", line 141, in as_list
    ans = self.domain_mapper.map_exp_ids(self.local_exp[label_to_use], **kwargs)

KeyError: 1

On the other hand, sp_obj.sp_explanations works, but gives me the same output as explainer.explain_instance:

W_pick=pd.DataFrame([dict(this.as_list(this.available_labels()[0])) for this in sp_obj.sp_explanations]).fillna(0)

marcotcr commented 3 years ago
ForgeFin commented 3 years ago
  • The prediction probability function should take a list as input, and output a 2d array of prediction probabilities. It seems that yours takes in a single string?

Ok thanks, that's clear now. Indeed, it takes a list of strings.

  • sp_obj.sp_explanations will give you a list of explanations for inspection, which should contain diverse explanations with important features. It does not give you a ranked list of feature importances.

Hm, that's what I don't understand. Shouldn't SP-LIME be able to return the global feature importances? As far as I understand, SP-LIME returns a set V of instances (up to a budget B) that are representative of the classification task, i.e., that cover the most important features.

For example, assume we have 3 instances with 4 features each, and say instances 1 and 2 share feature 1, which is the only feature shared among all 3 instances. Further, say instance 2 has 3 out of 4 features present and the other instances have fewer features present. If we select the single most representative instance, it would obviously be instance 2. But doesn't the importance function return the most important features within the selected set? So shouldn't it be possible to rank the features with the importance function?

Also, if I run SP-LIME only on instances that belong to the same class, shouldn't it be possible to say how the features affect my classification task globally?

Btw, the coverage function seems odd to me. I thought the coverage function simply sums the importance scores over all instances, depending on whether a feature is present. I do not understand the implementation below; could you please elaborate? Eq. 3 in the paper is quite difficult to read, for me at least.

current = np.dot(
        (np.sum(abs(W)[V + [i]], axis=0) > 0), importance
        )  # coverage function

Sorry for all the questions, but I can find no good explanations of the SP-LIME algorithm, since people mostly focus on LIME itself.

marcotcr commented 3 years ago

For example, assume we have 3 instances with 4 features each, and say instances 1 and 2 share feature 1, which is the only feature shared among all 3 instances. Further, say instance 2 has 3 out of 4 features present and the other instances have fewer features present. If we select the single most representative instance, it would obviously be instance 2. But doesn't the importance function return the most important features within the selected set? So shouldn't it be possible to rank the features with the importance function?

Yes, I guess you could use the sum of the weights of each feature across the dataset as a measure of feature importance (i.e. this). We do not return that, though.
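
If it helps, here is a minimal sketch along those lines (not part of the lime API; global_importance is just an illustrative name). It reuses the pattern from the W_pick snippet above and ranks words by the sum of their absolute weights across the sampled explanations:

import pandas as pd

W = pd.DataFrame(
    [dict(exp.as_list(exp.available_labels()[0])) for exp in sp_obj.explanations]
).fillna(0)

## sum of absolute weights per word across all sampled explanations
global_importance = W.abs().sum(axis=0).sort_values(ascending=False)
print(global_importance.head(10))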

current = np.dot(
        (np.sum(abs(W)[V + [i]], axis=0) > 0), importance
        )  # coverage function

(np.sum(abs(W)[V + [i]], axis=0) > 0) gives you which columns of the matrix W have a nonzero sum, i.e. which columns (features) are represented in the selected set of explanations. This is the indicator function in Equation 3 of the paper. The dot product then multiplies those columns by their feature importance (I_j in Equation 3 of the paper).
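
A toy numerical example may make this concrete (all numbers below are made up purely for illustration):

import numpy as np

## 3 explanations x 4 features; zero means the feature does not appear in that explanation
W = np.array([[0.5, 0.0, 0.0, 0.2],
              [0.3, 0.4, 0.1, 0.0],
              [0.0, 0.0, 0.6, 0.0]])
importance = np.array([0.9, 0.6, 0.8, 0.4])  ## stands in for I_j, assumed values

V = []   ## explanations already selected
i = 1    ## candidate explanation to add
covered = np.sum(abs(W)[V + [i]], axis=0) > 0   ## which features appear in V + [i]
current = np.dot(covered, importance)           ## weighted coverage from Eq. 3
print(covered)   ## [ True  True  True False]
print(current)   ## 0.9 + 0.6 + 0.8 = 2.3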