kazuto1011 / grad-cam-pytorch

PyTorch re-implementation of Grad-CAM (+ vanilla/guided backpropagation, deconvnet, and occlusion sensitivity maps)

Slight difference between Deconvnet and Guided BP #34

Closed 123dddd closed 3 years ago

123dddd commented 3 years ago

Thanks for the GREAT repo! I noticed there is only one small difference between the implementations of these 2 visualization methods:

For Guided BP we use: `return (F.relu(grad_in[0]),)`
For Deconvnet we use: `return (F.relu(grad_out[0]),)`

For me the Guided BP is understandable, but I am confused about the deconvnet. The deconvnet consists of unpooling, ReLU and deconvolution layers (https://www.quora.com/How-does-a-deconvolutional-neural-network-work), but I only find the ReLU operation, implemented with F.relu. Maybe I misunderstood the Deconvnet visualization method or missed something about the use of PyTorch. I hope you can point me in the right direction!

Thanks a lot.

kazuto1011 commented 3 years ago

> The deconvnet consists of unpooling, ReLU and deconvolution layers.

The unpooling and deconvolution are the backward routing of pooling and convolution, respectively. The ReLU here is a negative clipping of the gradient flow. Therefore you can say that guided BP consists of unpooling, ReLU, deconvolution, and the backward activation. The difference between deconvnet and guided BP is just the backward activation, which routes gradients based on the forward pass, not on the gradients themselves. The relevant papers only consider the ReLU activation, i.e. the backward ReLU.

|            | deconvolution | backward relu | gradient relu | unpooling |
| ---------- | ------------- | ------------- | ------------- | --------- |
| vanilla bp | ✓             | ✓             |               | ✓         |
| deconvnet  | ✓             |               | ✓             | ✓         |
| guided bp  | ✓             | ✓             | ✓             | ✓         |
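
Concretely, the rows of the table correspond to the following element-wise rules at a single ReLU. This is a toy sketch with made-up numbers, not code from this repo:

```python
import torch
import torch.nn.functional as F

# Toy values (made up for illustration).
x = torch.tensor([1.5, -0.5, 2.0, -1.0])         # forward input to a ReLU
grad_out = torch.tensor([0.3, 0.8, -0.6, -0.2])  # gradient arriving from the layer above

# Vanilla BP: backward ReLU only -- pass gradients where the forward input was positive.
vanilla = grad_out * (x > 0).float()             # [0.3, 0.0, -0.6, 0.0]

# Deconvnet: gradient ReLU only -- clip negative gradients, ignore the forward input.
deconvnet = F.relu(grad_out)                     # [0.3, 0.8, 0.0, 0.0]

# Guided BP: both masks -- equivalent to F.relu(vanilla).
guided = F.relu(grad_out) * (x > 0).float()      # [0.3, 0.0, 0.0, 0.0]
```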

> For Guided BP we use: `return (F.relu(grad_in[0]),)`
> For Deconvnet we use: `return (F.relu(grad_out[0]),)`

`grad_in` is the gradient after the backward ReLU, while `grad_out` is the gradient before the backward ReLU. `F.relu()` here is the gradient ReLU. In vanilla backpropagation, we return the raw `grad_in` to the next layer.
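
Putting it together, here is a minimal sketch of how the three variants can be realized with module backward hooks. It is a simplified illustration in the spirit of this repo, not its exact code; the model and helper names are made up:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_relu_hook(method):
    """Build a backward hook that rewrites the gradient leaving a ReLU module."""
    def hook(module, grad_in, grad_out):
        # grad_out[0]: gradient arriving at the ReLU from the layer above
        # grad_in[0]:  gradient after PyTorch's built-in backward ReLU
        #              (already masked by the sign of the forward input)
        if method == "deconvnet":
            return (F.relu(grad_out[0]),)  # gradient ReLU only
        if method == "guided":
            return (F.relu(grad_in[0]),)   # gradient ReLU on top of the backward ReLU
        return None                        # vanilla BP: leave grad_in untouched
    return hook

# Only the ReLU modules need hooks; autograd's backward pass through Conv2d and
# MaxPool2d already performs the "deconvolution" and "unpooling" routing.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.MaxPool2d(2))
handles = [m.register_backward_hook(make_relu_hook("guided"))
           for m in model.modules() if isinstance(m, nn.ReLU)]

x = torch.randn(1, 3, 32, 32, requires_grad=True)
model(x).sum().backward()
saliency = x.grad          # guided-BP style gradient w.r.t. the input

for h in handles:
    h.remove()
```

Passing `"deconvnet"` or anything else to `make_relu_hook` gives the other two columns of the table; everything outside the ReLU hooks stays the same.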

123dddd commented 3 years ago

Thanks for the helpful and fast reply! So in conclusion, the sole difference between the 3 approaches is how they backpropagate through the ReLU (as shown in the table above). As for the remaining parts, i.e. the unpooling and deconvolution layers, they are implemented identically across the methods. We just focus on the ReLU part of the backward route when we want to obtain the different kinds of saliency maps. Am I right?

kazuto1011 commented 3 years ago

Yes. Figure 1 of the guided BP paper (https://arxiv.org/pdf/1412.6806.pdf) is helpful for understanding this point.

123dddd commented 3 years ago

Thanks again for your kind help! Now these 3 methods are much clearer to me :)