How do we compute the gradient of the score y^c for class c?
In order to compute the gradient of the score for any class c with respect to the convolutional feature maps, we set the gradient at the pre-softmax layer (fc8 in the case of VGG) to a one-hot encoding of the target class c and backpropagate that gradient to the last convolutional layer (in most frameworks this can be done by simply calling backward).
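A minimal sketch of this step in PyTorch, assuming a torchvision VGG16 whose last convolutional layer sits at `features[28]`; the hooks and variable names are illustrative, not the repo's actual code:

```python
import torch
from torchvision import models

# Pretrained VGG16; the weights API varies across torchvision versions
# (older versions use pretrained=True instead of weights=...).
model = models.vgg16(weights="IMAGENET1K_V1").eval()

# Cache the last conv feature map A and its gradient dA via hooks.
feats, grads = {}, {}
last_conv = model.features[28]  # last Conv2d in VGG16's feature stack

def fwd_hook(module, inp, out):
    feats["A"] = out
    out.register_hook(lambda g: grads.update({"dA": g}))

last_conv.register_forward_hook(fwd_hook)

x = torch.randn(1, 3, 224, 224)   # stand-in for a preprocessed image
scores = model(x)                 # pre-softmax class scores (fc8-equivalent)

# One-hot "gradient" at the pre-softmax layer for the target class c,
# backpropagated down to the last convolutional layer.
c = scores.argmax(dim=1)
one_hot = torch.zeros_like(scores)
one_hot[0, c] = 1.0
scores.backward(gradient=one_hot)

A, dA = feats["A"], grads["dA"]   # feature maps and their gradients
```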
Why do we use Gradients in Grad-CAM?
Grad-CAM uses the fact that later convolutional layers capture more semantic information, and the importance of a neuron for class c can be determined by global-average-pooling the gradients obtained above over the spatial dimensions. In order to explain a particular image, these importance weights are combined with the convolutional feature maps A^k (a weighted sum followed by a ReLU) to obtain the Grad-CAM heat map.
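Continuing the sketch above (with `A` and `dA` as computed there), the pooling and the weighted combination might look like this; the normalization and upsampling at the end are just for visualization:

```python
import torch.nn.functional as F

# Neuron-importance weights: global-average-pool dA over the spatial dims.
alpha = dA.mean(dim=(2, 3), keepdim=True)            # shape (1, K, 1, 1)

# Weighted sum of the forward feature maps, then ReLU.
cam = F.relu((alpha * A).sum(dim=1, keepdim=True))   # shape (1, 1, H, W)

# Normalize to [0, 1] and upsample to input resolution for overlay.
cam = cam / (cam.max() + 1e-8)
cam = F.interpolate(cam, size=(224, 224), mode="bilinear", align_corners=False)
```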
Difference between visualization and gradient:
In the Grad-CAM paper, when we refer to visualization we mean the visual explanation (the Grad-CAM heat map or the Guided Grad-CAM gradients). Gradients are much more general and are used in different scenarios, e.g. for training or for generating explanations (as in the case of Grad-CAM).
Thank you!
Hello @ramprs
Is it still acceptable to backpropagate from the softmax output rather than from the pre-softmax scores to the last convolutional layer? I tested backpropagation from both the softmax layer and the logits and got quite similar results.
In the Grad-CAM paper, the authors use the gradient of the score y^c for class c with respect to the feature maps A^k of a convolutional layer to obtain the neuron-importance weights. But how do we compute the gradient of the score y^c for class c? And why?