insikk / Grad-CAM-tensorflow

tensorflow implementation of Grad-CAM (CNN visualization)

Question regarding the cost function for grad-CAM #4

Closed sjchoi86 closed 6 years ago

sjchoi86 commented 6 years ago

First of all, I really appreciate your implementation. It helped me a lot in getting started with my own Grad-CAM implementation.

I have a question regarding the cost function. I think the cost used for computing the gradient should be changed from

```python
prob = end_points['predictions']  # after softmax
cost = tf.reduce_sum((prob - labels) ** 2)
```

to

```python
y = tf.placeholder(tf.float32, [1, 1000])
...
logit = end_points['resnet_v1_50/logits']  # before softmax
cost = tf.reduce_sum(logit * y)
```
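Put together with the gradient step, this is what I have in mind; a sketch, assuming the `end_points` dict from the slim ResNet-50 used in this repo (`resnet_v1_50/block4` is just an example layer to visualize):

```python
import tensorflow as tf

# One-hot label selecting class c, as above.
y = tf.placeholder(tf.float32, [1, 1000])

# y^c: pre-softmax score of class c.
logit = end_points['resnet_v1_50/logits']
cost = tf.reduce_sum(logit * y)

# Feature maps A^k to visualize ('resnet_v1_50/block4' is just an example
# end_point from the slim ResNet-50; any conv layer works).
conv = end_points['resnet_v1_50/block4']

# Gradients of y^c w.r.t. the feature maps, shape [1, H, W, K].
grads = tf.gradients(cost, conv)[0]
```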

Please let me know if I am misunderstanding the equation. Thanks.

insikk commented 6 years ago

Thank you for pointing out this bug. I read the paper again, and found that we should use y^c instead of `cost = tf.reduce_sum((prob - labels) ** 2)`.

[Screenshot of Eq. (1) from the Grad-CAM paper, the neuron-importance weights: alpha_k^c = (1/Z) * sum_{i,j} ∂y^c / ∂A_{ij}^k]

In the paper, y^c is defined as the logit (pre-softmax score) of class c for the classification task. I think your implementation is correct, including selecting the class score by element-wise multiplication with y, the ground-truth one-hot label. (I'd avoid the variable name y, though, since it can confuse readers about whether it holds the predicted or the ground-truth value.)
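In code, the weights in that equation are just a global average pool of the gradients over the spatial axes. A NumPy sketch, assuming `grads_val` is the gradient fetched for a single image:

```python
import numpy as np

# grads_val: d y^c / d A^k for one image, shape [H, W, K].
grads_val = np.random.randn(7, 7, 2048).astype(np.float32)  # dummy values

# alpha^c_k: average over all spatial positions (i, j), per channel k.
alpha = np.mean(grads_val, axis=(0, 1))  # shape [K]
```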

Just be careful not to reduce the batch dimension when you run batched inference.
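A sketch of what that looks like, reusing the tensors from the snippet above but with `labels` as the one-hot tensor of shape [batch, 1000]:

```python
# Reduce only over the class axis so each example keeps its own y^c.
cost_per_example = tf.reduce_sum(logit * labels, axis=1)   # shape [batch]

# tf.gradients needs a scalar, but summing over the batch is safe here:
# example i's logits do not depend on example j's feature maps, so each
# gradient slice grads[i] is unaffected by the other examples.
grads = tf.gradients(tf.reduce_sum(cost_per_example), conv)[0]  # [batch, H, W, K]
```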

I also found that the gradient normalization is not necessary (it is not mentioned in the Grad-CAM paper). That normalization step was added when I wrote this code while referring to a Keras implementation of Grad-CAM.
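For reference, the removed step was the usual Keras-style L2 normalization, roughly of this form (the exact epsilon in that implementation may differ):

```python
# L2-style normalization of the gradients, as in the Keras Grad-CAM code;
# the Grad-CAM paper does not include it, so it can simply be dropped.
grads = grads / (tf.sqrt(tf.reduce_mean(tf.square(grads))) + 1e-5)
```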

I also found a bug in `utils.py`. When we compute the weighted sum of the feature maps, we should initialize with a zero-filled array, not a one-filled array.
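A sketch of the corrected accumulation, assuming `conv_output` holds one image's feature maps and `alpha` the per-channel weights from above:

```python
import numpy as np

def weighted_sum_cam(conv_output, alpha):
    # conv_output: [H, W, K] feature maps; alpha: [K] channel weights.
    cam = np.zeros(conv_output.shape[:2], dtype=np.float32)  # zeros, not ones
    for k, weight in enumerate(alpha):
        cam += weight * conv_output[:, :, k]
    # ReLU from the paper keeps only features with positive influence on c.
    return np.maximum(cam, 0)
```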

For more information about the new update, see this pull request.

Any further corrections or improvement pull requests are welcome :)