insikk / Grad-CAM-tensorflow

tensorflow implementation of Grad-CAM (CNN visualization)

Question regarding the cost function for grad-CAM #4

Closed sjchoi86 closed 6 years ago

sjchoi86 commented 6 years ago

First of all, I really appreciate your implementation. It helped me a lot in getting started with my own Grad-CAM implementation.

I have a question regarding the cost function. I think the cost used for computing the gradient should be changed from

```python
prob = end_points['predictions']  # after softmax
cost = tf.reduce_sum((prob - labels) ** 2)
```

to

```python
y = tf.placeholder(tf.float32, [1, 1000])
...
logit = end_points['resnet_v1_50/logits']  # before softmax
cost = tf.reduce_sum(logit * y)
```
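Put together with the gradient step, this is what I have in mind; a sketch, assuming the `end_points` dict from the slim ResNet-50 used in this repo (`resnet_v1_50/block4` is just an example layer to visualize):

```python
import tensorflow as tf

# One-hot label selecting class c, as above.
y = tf.placeholder(tf.float32, [1, 1000])

# y^c: pre-softmax score of class c.
logit = end_points['resnet_v1_50/logits']
cost = tf.reduce_sum(logit * y)

# Feature maps A^k to visualize ('resnet_v1_50/block4' is just an example
# end_point from the slim ResNet-50; any conv layer works).
conv = end_points['resnet_v1_50/block4']

# Gradients of y^c w.r.t. the feature maps, shape [1, H, W, K].
grads = tf.gradients(cost, conv)[0]
```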

Please let me know if I am misunderstanding the equation. Thanks.

insikk commented 6 years ago

Thank you for pointing out this bug. I read the paper again, and found that we should use y^c instead of `cost = tf.reduce_sum((prob - labels) ** 2)`.

[Screenshot of Eq. (1) from the Grad-CAM paper, the neuron-importance weights: alpha_k^c = (1/Z) * sum_{i,j} ∂y^c / ∂A_{ij}^k]

In the paper, y^c is defined as the logit (pre-softmax score) of class c for the classification task. I think your implementation is correct, including selecting the class score by element-wise multiplication with y, the ground-truth one-hot label. (I'd avoid the variable name y, though, since it can confuse readers about whether it holds the predicted or the ground-truth value.)
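In code, the weights in that equation are just a global average pool of the gradients over the spatial axes. A NumPy sketch, assuming `grads_val` is the gradient fetched for a single image:

```python
import numpy as np

# grads_val: d y^c / d A^k for one image, shape [H, W, K].
grads_val = np.random.randn(7, 7, 2048).astype(np.float32)  # dummy values

# alpha^c_k: average over all spatial positions (i, j), per channel k.
alpha = np.mean(grads_val, axis=(0, 1))  # shape [K]
```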

Just be careful not to reduce the batch dimension when you run batched inference.
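A sketch of what that looks like, reusing the tensors from the snippet above but with `labels` as the one-hot tensor of shape [batch, 1000]:

```python
# Reduce only over the class axis so each example keeps its own y^c.
cost_per_example = tf.reduce_sum(logit * labels, axis=1)   # shape [batch]

# tf.gradients needs a scalar, but summing over the batch is safe here:
# example i's logits do not depend on example j's feature maps, so each
# gradient slice grads[i] is unaffected by the other examples.
grads = tf.gradients(tf.reduce_sum(cost_per_example), conv)[0]  # [batch, H, W, K]
```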

I also found that the gradient normalization is not necessary (it is not mentioned in the Grad-CAM paper). That normalization step was added when I wrote this code while referring to a Keras implementation of Grad-CAM.
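For reference, the removed step was the usual Keras-style L2 normalization, roughly of this form (the exact epsilon in that implementation may differ):

```python
# L2-style normalization of the gradients, as in the Keras Grad-CAM code;
# the Grad-CAM paper does not include it, so it can simply be dropped.
grads = grads / (tf.sqrt(tf.reduce_mean(tf.square(grads))) + 1e-5)
```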

I also found a bug in `utils.py`. When we compute the weighted sum of the feature maps, we should initialize with a zero-filled array, not a one-filled array.
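A sketch of the corrected accumulation, assuming `conv_output` holds one image's feature maps and `alpha` the per-channel weights from above:

```python
import numpy as np

def weighted_sum_cam(conv_output, alpha):
    # conv_output: [H, W, K] feature maps; alpha: [K] channel weights.
    cam = np.zeros(conv_output.shape[:2], dtype=np.float32)  # zeros, not ones
    for k, weight in enumerate(alpha):
        cam += weight * conv_output[:, :, k]
    # ReLU from the paper keeps only features with positive influence on c.
    return np.maximum(cam, 0)
```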

For more information about the new update, see this pull request.

Any further corrections or improvement pull requests are welcome :)