Closed — alexrakowski closed this issue 5 years ago
We call this method guided backpropagation, because it adds an additional guidance signal from the higher layers to usual backpropagation. This prevents backward flow of negative gradients, corresponding to the neurons which decrease the activation of the higher layer unit we aim to visualize.
Am I missing something?
I think you should also be masking the gradients with respect to the output (top) gradients, in a similar manner as here
Hey, you mean the gradients should also be filtered by the non-zero forward output of the layers? Having re-read the paper (especially Figure 1), I see what you mean. In fact, the approach implemented in the repository looks closer to deconvolution (Zeiler & Fergus).
I will go over it during the weekend; there are a few more things I want to change, so I might as well implement them all together.
Thanks.
Edit: I experimented with a few things, and for AlexNet and VGG the output does not change, because a positive output in the forward pass also corresponds to a positive gradient in the backward pass once the ReLU masking is applied. The paper uses a different architecture, from NiN, built on Mlpconv layers, and there it does make a difference. Nevertheless, I will update the code, even though the results remain the same. (P.S. I wasn't actually convinced this was the case before running additional experiments; you can play around to convince yourself that in Figure 1b of the GBP paper, 'Backward pass: deconvnet' and 'Backward pass: guided backprop' actually lead to the same output for regular conv architectures like AlexNet and VGG.)
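To make the comparison concrete, here is a minimal NumPy sketch (mine, not the repository's code) of the three ReLU backward rules contrasted in Figure 1 of the guided backprop paper. The function names are my own; `x` is the ReLU's forward input and `r` is the gradient arriving from the layer above:

```python
import numpy as np

# Hypothetical standalone sketch of the ReLU backward variants:
#   backprop:  R = (x > 0) * R_top          (mask by forward input sign)
#   deconvnet: R = (R_top > 0) * R_top      (mask by top gradient sign only)
#   guided:    R = (x > 0) * (R_top > 0) * R_top  (mask by both)

def backprop(x, r):
    """Standard backprop: pass gradient where the forward input was positive."""
    return r * (x > 0)

def deconvnet(x, r):
    """Deconvnet (Zeiler & Fergus): pass only positive top gradients."""
    return r * (r > 0)

def guided(x, r):
    """Guided backprop: both masks applied."""
    return r * (x > 0) * (r > 0)

x = np.array([-1.0,  2.0, -3.0, 4.0])
r = np.array([ 2.0, -1.5,  0.5, 3.0])

print(deconvnet(x, r))  # [2.  0.  0.5 3. ] -- keeps entries where x <= 0
print(guided(x, r))     # [0. 0. 0. 3.]    -- zeros them, since the unit was inactive
```

The toy values show where deconvnet and guided backprop can disagree: positions where the top gradient is positive but the forward activation was not.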
Here are the changes: https://github.com/utkuozbulak/pytorch-cnn-visualizations/commit/5d052f1043362e83b4934d56343f05c28894a807
Isn't there a bug in the way negative gradients are filtered out for ReLUs in Guided Backprop?
What you do is:
`torch.clamp(grad_in[0], min=0.0)`
while the paper states:
It seems to me that you forgot to include the top gradient.
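For reference, the rule quoted above combines two masks: the forward activation's sign and the top gradient's sign. A minimal NumPy sketch of that rule (mine, not the repository's hook; `x` is the ReLU's forward input, `grad_top` the incoming gradient):

```python
import numpy as np

def relu_backward(x, grad_top):
    """Standard ReLU backward: gradient passes where the forward input was positive."""
    return grad_top * (x > 0)

def guided_relu_backward(x, grad_top):
    """Guided backprop rule from the paper: additionally zero out
    negative top gradients (the 'guidance signal' from higher layers)."""
    return grad_top * (x > 0) * (grad_top > 0)

x = np.array([-1.0, 2.0, 3.0, -0.5])
g = np.array([ 0.5, -1.0, 2.0,  1.5])

print(relu_backward(x, g))         # [ 0. -1.  2.  0.]
print(guided_relu_backward(x, g))  # [0. 0. 2. 0.]
```

The second entry illustrates the point of the issue: plain ReLU backward lets the negative gradient −1.0 through (the forward input 2.0 was positive), while guided backprop suppresses it because the top gradient is negative.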