pytorch / captum

Model interpretability and understanding for PyTorch
https://captum.ai
BSD 3-Clause "New" or "Revised" License

GradCAM negative values only for many inputs #740

Closed akshaysravindran closed 2 years ago

akshaysravindran commented 3 years ago

Hi, I implemented the following CNN model in PyTorch and was testing the different model explanation methods in Captum. Most of the methods seem to be working fine, but when I tried GradCAM using the LayerGradCam function, I observed that all attribution values are negative for many of the correctly predicted inputs (not all inputs, but multiple of them). Setting relu_attributions=True then essentially gives me all zeros. Is there any reason why I might be getting only negative values from GradCAM even though the input is correctly predicted by the model?

The input is of shape 1x62x375, and it is a 2-class problem.

Model definition

import torch
import torch.nn as nn

n = 10
num_units = 64

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, num_units, (1, n))
        self.pool1 = nn.MaxPool2d(pool_dim, strides_pool)

        self.conv2 = nn.Conv2d(num_units, num_units, (1, n))
        self.pool2 = nn.MaxPool2d(pool_dim, strides_pool)

        self.conv3 = nn.Conv2d(num_units, num_units, (1, n))
        self.pool3 = nn.MaxPool2d(pool_dim, strides_pool)

        self.conv4 = nn.Conv2d(num_units, num_units, (1, n))
        self.pool4 = nn.MaxPool2d(pool_dim, strides_pool)

        self.conv5 = nn.Conv2d(num_units, num_units, (1, n))
        self.conv6 = nn.Conv2d(num_units, num_units, (62, 1))   # 62 is the entire size of dimension 1

        self.fc1 = nn.Linear(384, num_units)
        self.dropout = nn.Dropout(0.5)
        self.fc2 = nn.Linear(num_units, 2)

        self.relu1 = nn.ReLU()
        self.relu2 = nn.ReLU()
        self.relu3 = nn.ReLU()
        self.relu4 = nn.ReLU()
        self.relu5 = nn.ReLU()
        self.relu6 = nn.ReLU()
        self.relu7 = nn.ReLU()

    def forward(self, x):
        x = self.pool2(self.relu2(self.conv2(self.pool1(self.relu1(self.conv1(x))))))
        x = self.pool4(self.relu4(self.conv4(self.pool3(self.relu3(self.conv3(x))))))
        x = self.relu6(self.conv6(self.relu5(self.conv5(x))))
        x = torch.flatten(x, 1)
        # print(x.size())
        x = self.dropout(self.relu7(self.fc1(x)))
        x = self.fc2(x)
        return x

.

.

Train the model

.

Explain the model decision

from captum.attr import LayerGradCam, LayerAttribution

images, labels = dataiter.next()                  # Get a batch of data
input = images
input.requires_grad = True

gradcam = LayerGradCam(model, model.conv5)        # Compute GradCAM for conv layer 5
gradcam_out = gradcam.attribute(input, 0, relu_attributions=True)   # W.r.t. first class (pos = 0)
gradcam_out = LayerAttribution.interpolate(gradcam_out, (62, 375))  # Upsample to match the input dimension
grad = gradcam_out.cpu().detach().numpy().squeeze()

bilalsal commented 3 years ago

Hi Akshay,

I have two recommendations (a short sketch of both is below):

1- Try the attribution with model.conv6 instead of model.conv5. CAM methods were conceived to operate on the activation maps of the final convolutional layer. These maps are used directly for determining the class, based on linear combinations of them (in the case of one FC layer) or non-linear combinations (in the case of multiple FC layers).

2- Try GuidedGradCam. The algorithm is designed to return saliency maps directly in the input space (no need for LayerAttribution.interpolate). An example is provided in the above link.
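For reference, here is a minimal sketch of both suggestions, reusing the model, input, and target from the snippet in the question (illustrative only, not a drop-in fix):

from captum.attr import LayerGradCam, GuidedGradCam, LayerAttribution

# 1- GradCAM on the final conv layer (conv6) instead of conv5
gradcam6 = LayerGradCam(model, model.conv6)
cam6 = gradcam6.attribute(input, target=0)                # attribution at conv6's spatial resolution
cam6_up = LayerAttribution.interpolate(cam6, (62, 375))   # upsample to the input size if needed

# 2- GuidedGradCAM combines GradCAM on a conv layer with Guided Backpropagation,
#    so the attribution already has the same shape as the input (1 x 62 x 375 here)
guided = GuidedGradCam(model, model.conv6)
guided_attr = guided.attribute(input, target=0)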

Hope this helps! Bilal

akshaysravindran commented 3 years ago

Hi Bilal,

Thank you for taking the time to respond to my question, I appreciate it. I do have a couple of comments on the suggestions and I am not sure if I could take either option for the nature of analysis being done.

1) I need to keep it at conv5, as doing GradCAM on conv6 would lose explanation resolution across dimension 1 of the data: the conv6 filters span the entire first dimension, and I wish to preserve the independence of explanations across it (the data is not an image). Also, even though GradCAM is recommended for the final conv layer, I think it should still work for preceding layers (the earlier we go, the more context we lose, but that should not mean GradCAM fails for the other conv layers, if I understand correctly).

2) GuidedGradCAM has been shown to have reliability issues (e.g., in the "Sanity Checks for Saliency Maps" paper), and GradCAM is more robust to them, which is why I need to stick with GradCAM.

I still do not understand why GradCAM gives only negative values for an input that the model predicts correctly.

bilalsal commented 3 years ago

Hi Akshay,

It is technically possible to compute GradCAM w.r.t. internal conv layers (preceding the final one). However, we cannot expect the computed maps to behave as class activation maps: they do not necessarily tell you which parts of the input triggered target=0 as an output.

This has to do with how CNNs work. Starting from conv1, each conv layer responds to some abstract 2D features in the input (e.g. edges, circles, letters), not directly related to any of your target classes. The final conv layer, however, extracts 2D features from the input that are "ready for classification": one FC layer is often enough to define each of your target classes as a linear combination of these (flattened) features. In cat-vs-dog classifiers, these features can be as ready as "cat face", "cat ears", "dog shape", "dog ears", "animal pose", etc., so the FC layer can easily assign positive or negative weights to each of these features toward each of the target classes. If you look at earlier conv layers, the features might seem more random to you and not ready for classification. The attribution scores you get for them might not always be as interpretable and informative as for the last conv layer.

I recommend you visualize the feature maps of model.conv5 for a few inputs to get an intuition of what you are actually attributing your output to.
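If it helps, here is a quick sketch of that inspection using Captum's LayerActivation (the plotting part is just one possible way to look at the maps; adjust the grid size to your num_units):

import matplotlib.pyplot as plt
from captum.attr import LayerActivation

# Capture the activation maps of conv5 for one input
layer_act = LayerActivation(model, model.conv5)
act = layer_act.attribute(input)               # shape: batch x num_units x H x W
act = act[0].detach().cpu().numpy()

# Show the first 16 feature maps to see which patterns conv5 responds to
fig, axes = plt.subplots(4, 4, figsize=(12, 6))
for i, ax in enumerate(axes.flat):
    ax.imshow(act[i], aspect="auto", cmap="viridis")
    ax.set_title(f"channel {i}")
    ax.axis("off")
plt.tight_layout()
plt.show()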

You are right that GuidedGradCAM has some issues with the sanity checks from Adebayo et al. Nevertheless, it might be worth looking at the attributions it computes, just as a baseline. This will help you set expectations and debug your LayerGradCam results.

Hope this helps Bilal

akshaysravindran commented 3 years ago

Thank you, Bilal, for the detailed explanation; that definitely is helpful. Previously, I had used some simulated inputs where I knew the ground truth, and when evaluating the mean relevancy across columns I saw very similar explanations for both conv layers. Here, I replaced conv6 with an additional conv layer identical to conv5, evaluated GradCAM w.r.t. both the conv5 and conv6 layers, and obtained the following relevancy (the 62 mean relevancy scores plotted in a circle; an application-specific representation). I also used a separate GradCAM implementation to cross-check.

Since GradCAM gives a spatial relevancy score w.r.t. the activations, my expectation was that conv5 and conv6 should yield very similar relevancy, considering the hierarchical nature of convnets (at least in the final few layers).

[Three attached figures comparing the conv5 and conv6 relevancy maps]

bilalsal commented 3 years ago

Hi Akshay,

This is a very interesting use case. While the attribution does differ between conv5 and conv6, the difference does not seem fundamental to me. In two of your examples, the attribution map seems more focused at conv6 and more "diffuse" at conv5. This matches the intuition that the earlier the layer you attribute to, the more "generic" the features you will see in the saliency maps, while the deeper the layer, the more likely the features are to be classification-ready.

Keep up the good analysis; I hope it helps you answer the fundamental questions you are after.

akshaysravindran commented 3 years ago

Thank you, Bilal, I appreciate you taking the time to provide the suggestions and for the kind words.

I am still having difficulty understanding why GradCAM would produce all negative values (zero after ReLU) for a correctly predicted input, though.

I tried another architecture and computed GradCAM w.r.t. the last conv layer, but I still find some cases where the GradCAM output is all negative. Not all examples, but many examples in a batch still end up with only zeros after passing through the ReLU. Do you have any thoughts on why that might be happening? I want to better understand what is causing this.

Thanks again for your help and insight.

bilalsal commented 3 years ago

Hi @akshaysravindran,

I understand why you find it weird that the Grad-CAM attributions for certain instances are all negative. It might have to do with your CNN containing two fully-connected layers. What about debugging the algorithm's intermediate values for one of the instances? For example, you could trace back the connections from the correct prediction Y to relu6 in order to check which feature maps in that layer are supposed to "vote" for the prediction, meaning that an increased activation in these maps leads to an increase in Y. Grad-CAM relies on these feature maps to compute the attribution. If Y has negative connections to all feature maps, this is likely the reason (though it would be surprising for training to lead to such connections).
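Here is a rough, hand-rolled sketch of such a check for one instance. It mirrors what LayerGradCam computes internally (the per-channel weights are the spatially averaged gradients of the target logit); the hook-based bookkeeping is only illustrative:

import torch

# Hook the layer that GradCAM attributes to (conv5 in the question; conv6 works the same way)
acts, grads = {}, {}
h1 = model.conv5.register_forward_hook(lambda m, i, o: acts.update(out=o))
h2 = model.conv5.register_full_backward_hook(lambda m, gi, go: grads.update(out=go[0]))

model.eval()
logits = model(input)                 # input: 1 x 1 x 62 x 375
logits[0, 0].backward()               # gradient of the (correct) target-0 logit
h1.remove(); h2.remove()

# GradCAM channel weights = spatial mean of the gradients, one weight per feature map
w = grads["out"].mean(dim=(2, 3))[0]                        # shape: num_units
cam = (w[:, None, None] * acts["out"][0]).sum(0).detach()   # weighted sum over channels, before ReLU

print("feature maps voting for the prediction (positive weight):",
      int((w > 0).sum().item()), "of", w.numel())
print("CAM value range:", cam.min().item(), cam.max().item())

Checking the signs of w and of the resulting map should show whether the all-negative attributions come from the channel weights, from the activations, or from both.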

Maybe we can also consult the Grad-CAM authors to see whether they have encountered such cases? You could post a question at https://github.com/ramprs/grad-cam/issues

@NarineK , @vivekmig: Do you have further insights on this?