pytorch / captum

Model interpretability and understanding for PyTorch
https://captum.ai
BSD 3-Clause "New" or "Revised" License

Negative Deconvolution outputs - what am I missing? #408

Closed. maikefer closed this issue 4 years ago.

maikefer commented 4 years ago

Hi, first of all thanks for the amazing work you have done with Captum!

I have a question about the Deconvolution implementation. Most likely I'm missing an important implementation detail or using the code incorrectly, so hopefully you can point me in the right direction.

I'm using a ResNet with an 8x28x80 input (images over time) with values normalised to between -1 and 1. Using the Deconvolution attribution, the resulting values range from -0.15 to 0.15. As far as I understand Deconvolution, the ReLU is applied in such a way that only non-negative gradients are backpropagated through the network. In my head, this would mean that the output is also always non-negative, but in my case this obviously does not hold.
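
For reference, this is roughly the kind of call I'm making. The model and input below are just random placeholders standing in for my ResNet and data (treating the first dimension of the 8x28x80 input as channels), not the real setup:

```python
import torch
import torch.nn as nn
from captum.attr import Deconvolution

# Placeholder model standing in for the real ResNet.
model = nn.Sequential(
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
)
model.eval()

# Random stand-in for an 8x28x80 input normalised to [-1, 1].
inputs = torch.rand(1, 8, 28, 80) * 2 - 1

deconv = Deconvolution(model)
attributions = deconv.attribute(inputs, target=0)
print(attributions.min().item(), attributions.max().item())  # negative values show up here too
```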

I thought about different things that might cause this problem:

When I apply Captum's Deconvolution, I also get the following User Warnings:

C:\Users\...\anaconda3\envs\...\lib\site-packages\captum\attr\_utils\gradient.py:33: UserWarning: Input Tensor 0 did not already require gradients, required_grads has been set automatically.
  "required_grads has been set automatically." % index
C:\Users\...\anaconda3\envs\...\lib\site-packages\captum\attr\_core\guided_backprop_deconvnet.py:56: UserWarning: Setting backward hooks on ReLU activations.The hooks will be removed after the attribution is finished
  "Setting backward hooks on ReLU activations."

Running on Windows 10, CPU only, Python 3.6, Captum 0.2.0, PyTorch 1.5.0.

vivekmig commented 4 years ago

Hi @maikefer , thanks for the feedback!

For deconvolution, while only non-negative gradients are backpropagated through ReLU layers, other layers may still pass negative values in backpropagation. If you test with a model where the first layer is a ReLU layer, then the output attributions should be all non-negative as you expect.

Since ResNet architectures generally have convolutional layers prior to the first ReLU, the gradients can become negative based on the weights of this layer. In particular, there are likely negative weights in the first convolution kernel, which make these gradients negative. Hope this helps!
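
A small sketch to illustrate (the two toy models here are just illustrative, not your ResNet): with a ReLU as the very first layer, the deconvolution attributions stay non-negative, while a layer with negative weights in front of the first ReLU can produce negative attributions at the input.

```python
import torch
import torch.nn as nn
from captum.attr import Deconvolution

class ReLUFirst(nn.Module):
    # ReLU is the first layer, so the deconvolution rule clamps the gradient
    # to non-negative values right before it reaches the input.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.ReLU(), nn.Linear(4, 2))

    def forward(self, x):
        return self.net(x)

class LinearFirst(nn.Module):
    # A layer with (possibly negative) weights sits before the first ReLU,
    # analogous to the first convolution in a ResNet, so the input
    # attributions can become negative.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 2))

    def forward(self, x):
        return self.net(x)

torch.manual_seed(0)
x = torch.randn(1, 4)

print(Deconvolution(ReLUFirst()).attribute(x, target=0).min())    # always >= 0
print(Deconvolution(LinearFirst()).attribute(x, target=0).min())  # can be < 0
```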

maikefer commented 4 years ago

Yeah, this helps a lot! Thanks :-)