Hi @man0007, I think this paper could be a good starting point. I will try to explain it from an intuitive point of view, in the context of an image classification model. My knowledge of the field is a bit dated, so it is possible the methodology has since been improved or revised.
LIME
Key idea: given a data point x_i, you sample a set of locally nearby data points based on x_i's interpretable representation, then approximate the model's behavior around x_i with a self-explanatory model, say, a decision tree or a linear regression. This gives you a rule-based decision or a ranking of interpretable components that a human can easily understand.
What is an interpretable representation? Take the image below as an example: a single pixel is not meaningful to us, but the pixel area that covers the tree branch, or the bird's wings, is meaningful, and hence naturally becomes interpretable.
Source: https://www.freeimages.com/photo/bird-1361326
Using the same example, one way LIME can define an interpretable representation of this image is to say there are 3 interpretable components: tree branch, bird, and background. You can represent them as a binary vector, with the original image encoded as [1, 1, 1]. If you want to sample, say, removing the bird, the corresponding encoding is [1, 0, 1]. To translate this into a proper input that can be fed to the image classification model we aim to explain, say, VGG16, you simply zero out the pixels covering the bird area (in practice, image segmentation is used to generate these so-called "super-pixels"). You can then generate the label for this newly sampled image.
Let's say you have a bunch of sampled binary vectors X and the corresponding VGG16 outputs Y; you then train a linear model that best fits (X, Y). This is the explanation LIME provides: each component is given a score reflecting how important it is in explaining the model's prediction on the image.
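For concreteness, here is a minimal sketch of that loop in plain Python. It is not the actual lime library code; `image`, `segments`, `predict_proba`, and the sample count are hypothetical stand-ins for the original image, a 3-component segmentation map, and a wrapper around the model being explained.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical stand-ins, for illustration only:
#   image         - the original H x W x 3 array
#   segments      - an H x W integer map assigning each pixel to one of the 3
#                   interpretable components (0 = branch, 1 = bird, 2 = background)
#   predict_proba - wraps the model we want to explain (e.g. VGG16) and returns
#                   the probability of the "bird" class for a single image

def mask_image(image, segments, z):
    """Translate a binary vector z (one bit per component) into a real input
    by zeroing out the pixels of every component whose bit is 0."""
    out = image.copy()
    for comp, keep in enumerate(z):
        if keep == 0:
            out[segments == comp] = 0
    return out

def lime_sketch(image, segments, predict_proba, n_components=3, n_samples=200):
    rng = np.random.default_rng(0)
    # 1. Sample binary vectors around the original point [1, 1, 1]
    Z = rng.integers(0, 2, size=(n_samples, n_components))
    # 2. Label every perturbed image with the model we want to explain
    y = np.array([predict_proba(mask_image(image, segments, z)) for z in Z])
    # 3. Fit a simple linear model on (Z, y); its coefficients are the explanation
    surrogate = Ridge(alpha=1.0).fit(Z, y)
    return surrogate.coef_  # one importance score per interpretable component
```

The real LIME implementation additionally weights each sample by its proximity to the original image, but the overall shape of the procedure is the same.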
Quantitative Input Influence
Key idea (marginal contribution): Alice, Bob, and Carol work together on a project and receive a divisible reward. Ideally, we should distribute the reward proportionally to each person's contribution. The amount of work done by all 3 people, versus the amount done without Alice, is Alice's marginal contribution in the 3-person setup. One has to consider all possible ways a group can form (one-person groups are also counted), then average over them to decide how much of the reward each person deserves. As the number of people grows, it becomes computationally intractable to scan through all possibilities, so we sample instead.
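To make the averaging concrete, here is a small sketch that enumerates every join order for a 3-person game; the `worth` values are made up purely for illustration. For larger groups you would average over randomly sampled orders instead of all of them, as noted above.

```python
from itertools import permutations

# Made-up "worth" of each possible team: the value of the project when only
# that subset of people works on it (illustration only).
worth = {
    frozenset(): 0,
    frozenset({"Alice"}): 4,
    frozenset({"Bob"}): 3,
    frozenset({"Carol"}): 2,
    frozenset({"Alice", "Bob"}): 9,
    frozenset({"Alice", "Carol"}): 7,
    frozenset({"Bob", "Carol"}): 6,
    frozenset({"Alice", "Bob", "Carol"}): 12,
}

def shapley_shares(players, worth):
    """Average each player's marginal contribution over all join orders."""
    shares = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        team = frozenset()
        for p in order:
            # Marginal contribution of p: team's worth with p minus without p
            shares[p] += worth[team | {p}] - worth[team]
            team = team | {p}
    return {p: s / len(orders) for p, s in shares.items()}

print(shapley_shares(["Alice", "Bob", "Carol"], worth))
# The three shares always add up to the full team's worth (12 here).
```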
In the image context, with the same image above, let's say you split the pixels into 100 non-overlapping 4x4 square pixel sets. Each pixel set is equivalent to one person: a pixel set that "participates in the project" is kept intact, whereas a "non-participant" is zeroed out. You then do n rounds of sampling, each as follows:
1. Randomly select a subset S of pixel sets to be active, translate it into a proper input that can be fed to the model you want to explain, and collect the prediction probability, i.e. how likely it is a picture of a bird. (In the paper's formal language, this probability is the quantity of interest.)
2. Turn one pixel set S_i in S into a "non-participant", then follow the same procedure to compute the output with the remaining active pixel sets.
3. The gap between the two predictions is the marginal contribution of S_i within the set S.
At the end of the sampling, you can estimate the contribution of each of the 100 pixel sets; we call this its influence score. The scores can be rendered as a heatmap over the original image, giving a human-interpretable explanation.
Source: https://glassboxmedicine.com/2019/06/11/cnn-heat-maps-class-activation-mapping-cam/
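As a rough sketch of the sampling loop described in the three steps above (again with hypothetical `image`, `pixel_sets`, and `predict_proba` stand-ins, not the authors' actual QII code):

```python
import numpy as np

# Hypothetical stand-ins, for illustration only:
#   image         - the original image as an H x W x 3 array
#   pixel_sets    - a list of 100 boolean masks, one per non-overlapping 4x4 block
#   predict_proba - wraps the model being explained and returns P("bird") for one image

def qii_sketch(image, pixel_sets, predict_proba, n_samples=500, seed=0):
    rng = np.random.default_rng(seed)
    n = len(pixel_sets)
    influence = np.zeros(n)
    counts = np.zeros(n)
    for _ in range(n_samples):
        # 1. Pick a random subset S of pixel sets to keep active, zero out the rest
        active = rng.integers(0, 2, size=n).astype(bool)
        masked = image.copy()
        for j, keep in enumerate(active):
            if not keep:
                masked[pixel_sets[j]] = 0
        p_with = predict_proba(masked)
        # 2. Turn one active pixel set S_i into a "non-participant" as well
        candidates = np.flatnonzero(active)
        if candidates.size == 0:
            continue
        i = rng.choice(candidates)
        masked[pixel_sets[i]] = 0
        p_without = predict_proba(masked)
        # 3. The drop in predicted probability is S_i's marginal contribution in S
        influence[i] += p_with - p_without
        counts[i] += 1
    # Average influence score per pixel set (the values behind the heatmap)
    return influence / np.maximum(counts, 1)
```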
Hope this helps.
On this occasion, I want to make clear that this code is based on the work of Anupam Datta, Shayak Sen, and Yair Zick, the authors of "Algorithmic Transparency via Quantitative Input Influence: Theory and Experiments with Learning Systems". I simply reworked the code to serve my own purpose. Please give full credit to them, as I was mistaken for the author in a similar issue you posted on LIME.
Sincerely, Vinh