Closed: 123dddd closed this issue 3 years ago
The deconvnet consists of unpooling, ReLU, and deconvolution layers.
The unpooling and deconvolution are the backward routes of pooling and convolution, respectively. The ReLU here clips negative gradients as they flow backward. Therefore you can say guided BP consists of unpooling, gradient ReLU, deconvolution, and a backward activation. The only difference is the backward activation, which routes gradients based on the forward pass, not on the gradients themselves. The relevant papers only consider the ReLU activation, i.e. the backward ReLU.
| | deconvolution | backward relu | gradient relu | unpooling |
|---|---|---|---|---|
| vanilla bp | ✓ | ✓ | | ✓ |
| deconvnet | ✓ | | ✓ | ✓ |
| guided bp | ✓ | ✓ | ✓ | ✓ |
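The table can be checked on a toy example. This is a minimal sketch (tensor values are made up for illustration) applying the two masks by hand: the backward ReLU masks by the sign of the forward input, and the gradient ReLU masks by the sign of the gradient itself.

```python
import torch

# Hypothetical single-ReLU example (values are illustrative only).
x = torch.tensor([-1.0, 2.0, -3.0, 4.0])         # pre-activation from the forward pass
grad_out = torch.tensor([0.5, 0.5, 1.0, -1.0])   # gradient arriving from the layer above

forward_mask = (x > 0).float()         # backward ReLU: pass only where the forward input was positive
positive_grad = grad_out.clamp(min=0)  # gradient ReLU: pass only positive gradients

vanilla = grad_out * forward_mask      # backward ReLU only
deconvnet = positive_grad              # gradient ReLU only
guided = positive_grad * forward_mask  # both masks applied

print(vanilla)    # tensor([ 0.0000,  0.5000,  0.0000, -1.0000])
print(deconvnet)  # tensor([0.5000, 0.5000, 1.0000, 0.0000])
print(guided)     # tensor([0.0000, 0.5000, 0.0000, 0.0000])
```

Note that guided BP keeps only the entries that survive both masks, which is why its maps are typically sparser.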
For Guided BP we use `return (F.relu(grad_in[0]),)`. For Deconvnet we use `return (F.relu(grad_out[0]),)`.
`grad_in` is the gradient after the backward ReLU, while `grad_out` is the gradient before the backward ReLU. `F.relu()` acts as the gradient ReLU. In vanilla backpropagation, we return the raw `grad_in` to the next layer.
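Here is a minimal runnable sketch of what these hook returns do, assuming a plain `nn.ReLU` module and PyTorch's `register_full_backward_hook` (the hook names are made up; the return statements are the ones quoted above):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def guided_hook(module, grad_in, grad_out):
    # grad_in[0] already passed the backward ReLU; add the gradient ReLU on top.
    return (F.relu(grad_in[0]),)

def deconvnet_hook(module, grad_in, grad_out):
    # grad_out[0] has not passed the backward ReLU; apply the gradient ReLU only.
    return (F.relu(grad_out[0]),)

relu = nn.ReLU()
handle = relu.register_full_backward_hook(guided_hook)

x = torch.tensor([-1.0, 2.0, 4.0], requires_grad=True)
y = relu(x)
y.backward(torch.tensor([1.0, -1.0, 2.0]))  # upstream gradient
print(x.grad)  # guided BP: tensor([0., 0., 2.])
handle.remove()
```

With the upstream gradient `[1, -1, 2]`, the backward ReLU first zeroes the entry where `x` was negative, giving `[0, -1, 2]`, and the gradient ReLU then zeroes the negative entry, giving `[0, 0, 2]`.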
Thanks for the helpful and fast reply! So in conclusion, the sole difference between the 3 approaches is how they backpropagate through the ReLU (as shown in the table above). The rest, i.e. the unpooling and deconvolution layers, are the same at the implementation level. We only need to change the ReLU part of the backward route to get the different kinds of saliency maps. Am I right?
Yes. Figure 1 of the guided BP paper (https://arxiv.org/pdf/1412.6806.pdf) is helpful for understanding this point.
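Putting this together, a hypothetical end-to-end sketch (the model, shapes, and hook name are made up for illustration, not taken from the repo) where only the ReLU backward hook changes between methods, while the conv and pooling backward passes stay untouched:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def guided_hook(module, grad_in, grad_out):
    # Guided BP; swap in F.relu(grad_out[0]) here for the deconvnet variant.
    return (F.relu(grad_in[0]),)

# Toy CNN: conv/pool backward = deconvolution/unpooling, left unmodified.
model = nn.Sequential(
    nn.Conv2d(1, 2, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(2, 4, 3, padding=1), nn.ReLU(),
)
for m in model.modules():
    if isinstance(m, nn.ReLU):
        m.register_full_backward_hook(guided_hook)

img = torch.randn(1, 1, 8, 8, requires_grad=True)
score = model(img).sum()   # stand-in for a class score
score.backward()
saliency = img.grad.abs()  # the saliency map lives in the input gradient
print(saliency.shape)      # torch.Size([1, 1, 8, 8])
```

This mirrors the conclusion above: the hook on the ReLU modules is the only thing that distinguishes vanilla BP, deconvnet, and guided BP.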
Thanks again for your kind help! Now these 3 methods are clearer to me :)
Thanks for the GREAT repo! I noticed there is only one small difference between the algorithms of these 2 visualization methods:
For Guided BP we use `return (F.relu(grad_in[0]),)`. For Deconvnet we use `return (F.relu(grad_out[0]),)`.
For me the Guided BP is understandable, but I am confused about the deconvnet. The deconvnet consists of unpooling, ReLU, and deconvolution layers (https://www.quora.com/How-does-a-deconvolutional-neural-network-work), but I only found the ReLU operation implemented, using `F.relu`. Maybe I have misunderstood the Deconvnet visualization method, or I am missing something about the use of PyTorch. I hope you can point me in the right direction!
Thanks a lot.