kazuto1011 / grad-cam-pytorch

PyTorch re-implementation of Grad-CAM (+ vanilla/guided backpropagation, deconvnet, and occlusion sensitivity maps)

Slight difference between Deconvnet and Guided BP #34

Closed 123dddd closed 3 years ago

123dddd commented 3 years ago

Thanks for the GREAT repo! I noticed there is only one small difference between the implementations of these 2 visualization methods:

For Guided BP we use: `return (F.relu(grad_in[0]),)`
For Deconvnet we use: `return (F.relu(grad_out[0]),)`

For me the Guided BP is understandable, but I am confused about the deconvnet. The deconvnet consists of unpooling, ReLU and deconvolution layers (https://www.quora.com/How-does-a-deconvolutional-neural-network-work), but I only find the ReLU operation, implemented with F.relu. Maybe I misunderstood the Deconvnet visualization method or missed something about the use of PyTorch. I hope you can point me in the right direction!

Thanks a lot.

kazuto1011 commented 3 years ago

> The deconvnet consists of unpooling, ReLU and deconvolution layers.

The unpooling and deconvolution are the backward routing of pooling and convolution, respectively. The ReLU here is a negative clipping of the gradient flow. Therefore you can say that guided BP consists of unpooling, ReLU, deconvolution, and the backward activation. The difference between deconvnet and guided BP is just the backward activation, which routes gradients based on the forward pass, not on the gradients themselves. The relevant papers only consider the ReLU activation, i.e. the backward ReLU.

|            | deconvolution | backward relu | gradient relu | unpooling |
| ---------- | ------------- | ------------- | ------------- | --------- |
| vanilla bp | ✓             | ✓             |               | ✓         |
| deconvnet  | ✓             |               | ✓             | ✓         |
| guided bp  | ✓             | ✓             | ✓             | ✓         |
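
Concretely, the rows of the table correspond to the following element-wise rules at a single ReLU. This is a toy sketch with made-up numbers, not code from this repo:

```python
import torch
import torch.nn.functional as F

# Toy values (made up for illustration).
x = torch.tensor([1.5, -0.5, 2.0, -1.0])         # forward input to a ReLU
grad_out = torch.tensor([0.3, 0.8, -0.6, -0.2])  # gradient arriving from the layer above

# Vanilla BP: backward ReLU only -- pass gradients where the forward input was positive.
vanilla = grad_out * (x > 0).float()             # [0.3, 0.0, -0.6, 0.0]

# Deconvnet: gradient ReLU only -- clip negative gradients, ignore the forward input.
deconvnet = F.relu(grad_out)                     # [0.3, 0.8, 0.0, 0.0]

# Guided BP: both masks -- equivalent to F.relu(vanilla).
guided = F.relu(grad_out) * (x > 0).float()      # [0.3, 0.0, 0.0, 0.0]
```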

> For Guided BP we use: `return (F.relu(grad_in[0]),)`
> For Deconvnet we use: `return (F.relu(grad_out[0]),)`

`grad_in` is the gradient after the backward ReLU, while `grad_out` is the gradient before the backward ReLU. `F.relu()` here is the gradient ReLU. In vanilla backpropagation, we return the raw `grad_in` to the next layer.
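
Putting it together, here is a minimal sketch of how the three variants can be realized with module backward hooks. It is a simplified illustration in the spirit of this repo, not its exact code; the model and helper names are made up:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_relu_hook(method):
    """Build a backward hook that rewrites the gradient leaving a ReLU module."""
    def hook(module, grad_in, grad_out):
        # grad_out[0]: gradient arriving at the ReLU from the layer above
        # grad_in[0]:  gradient after PyTorch's built-in backward ReLU
        #              (already masked by the sign of the forward input)
        if method == "deconvnet":
            return (F.relu(grad_out[0]),)  # gradient ReLU only
        if method == "guided":
            return (F.relu(grad_in[0]),)   # gradient ReLU on top of the backward ReLU
        return None                        # vanilla BP: leave grad_in untouched
    return hook

# Only the ReLU modules need hooks; autograd's backward pass through Conv2d and
# MaxPool2d already performs the "deconvolution" and "unpooling" routing.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.MaxPool2d(2))
handles = [m.register_backward_hook(make_relu_hook("guided"))
           for m in model.modules() if isinstance(m, nn.ReLU)]

x = torch.randn(1, 3, 32, 32, requires_grad=True)
model(x).sum().backward()
saliency = x.grad          # guided-BP style gradient w.r.t. the input

for h in handles:
    h.remove()
```

Passing `"deconvnet"` or anything else to `make_relu_hook` gives the other two columns of the table; everything outside the ReLU hooks stays the same.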

123dddd commented 3 years ago

Thanks for the helpful and fast reply! So in conclusion, the sole difference between the 3 approaches is how they backpropagate through the ReLU (as shown in the table above). As for the remaining parts, i.e. the unpooling and deconvolution layers, they are implemented identically across the methods. We just focus on the ReLU part of the backward route when we want to obtain the different kinds of saliency maps. Am I right?

kazuto1011 commented 3 years ago

Yes. Figure 1 of the guided BP paper (https://arxiv.org/pdf/1412.6806.pdf) is helpful for understanding this point.

123dddd commented 3 years ago

Thanks again for your kind help! Now these 3 methods are much clearer to me :)