Closed maikefer closed 4 years ago
Hi @maikefer , thanks for the feedback!
For deconvolution, while only non-negative gradients are backpropagated through ReLU layers, other layers may still pass negative values in backpropagation. If you test with a model where the first layer is a ReLU layer, then the output attributions should be all non-negative as you expect.
Since ResNet architectures generally have convolutional layers prior to the first ReLU, the gradients can become negative depending on the weights of that layer. In particular, the first convolution kernel likely contains negative weights, which makes the backpropagated gradients (and thus the attributions) negative. Hope this helps!
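A quick way to see this (a minimal sketch with toy placeholder models, not code from this issue): when the very first layer is a ReLU, the Deconvolution attributions at the input stay non-negative, while a convolution before the first ReLU can flip the sign through its negative weights.

```python
import torch
import torch.nn as nn
from captum.attr import Deconvolution

torch.manual_seed(0)

# Toy model whose first layer is a ReLU: the gradient reaching the input passes
# through the overridden ReLU backward, so the attributions stay non-negative.
relu_first = nn.Sequential(
    nn.ReLU(),
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.Flatten(),
    nn.Linear(4 * 8 * 8, 2),
)

# Toy model with a convolution before the first ReLU: negative kernel weights
# can make the gradients (and hence the attributions) negative.
conv_first = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(4 * 8 * 8, 2),
)

x = torch.randn(1, 3, 8, 8)

print(Deconvolution(relu_first).attribute(x, target=0).min().item())  # >= 0
print(Deconvolution(conv_first).attribute(x, target=0).min().item())  # typically < 0
```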
Yeah, this helps a lot! Thanks :-)
Hi, first of all thanks for the amazing work you have done with Captum!
I have a question on the Deconvolution implementation. Most likely I'm missing an important implementation detail or using the code wrong. So hopefully you can direct me to the right thing.
I'm using a ResNet with an 8x28x80 input (images over time) with normalised values between -1 and 1. Using the Deconvolution attribution, the resulting values range from -0.15 to 0.15. As far as I understand Deconvolution, the ReLU is applied in such a way that only non-negative gradients are backpropagated through the network. In my head, this would mean that the output is also always non-negative, but obviously, in my case this does not hold (a sketch of roughly how I call it is at the end of this comment).
I thought about different things that might cause this problem:
When I apply Captum's Deconvolution, I also get the following UserWarnings:
Running on Windows 10, CPU only, Python 3.6, Captum 0.2.0, PyTorch 1.5.0
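For reference, roughly how I'm calling it (a sketch only: a stock torchvision ResNet-18 with its first conv adapted to 8 channels stands in for my actual model, which I haven't included here):

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18
from captum.attr import Deconvolution

# Stand-in for the real architecture: ResNet-18 with an 8-channel first conv
# and 2 output classes (both numbers are placeholders).
model = resnet18(num_classes=2)
model.conv1 = nn.Conv2d(8, 64, kernel_size=7, stride=2, padding=3, bias=False)
model.eval()

# One 8x28x80 sample, values normalised to [-1, 1]
inputs = torch.rand(1, 8, 28, 80) * 2 - 1

deconv = Deconvolution(model)
attributions = deconv.attribute(inputs, target=0)

# The minimum can be negative, since conv1 sits before the first ReLU.
print(attributions.min().item(), attributions.max().item())
```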