pytorch / captum

Model interpretability and understanding for PyTorch
https://captum.ai
BSD 3-Clause "New" or "Revised" License

NoiseTunnel and GuidedGradCam: Bug or feature? #429

Closed maikefer closed 3 years ago

maikefer commented 4 years ago

I tried putting a NoiseTunnel around the GuidedGradCam attribution. The resulting attributions are just zeros, which is not what I would have expected. Is this a bug? Am I missing something about either of the two techniques (NoiseTunnel / GuidedGradCam)? I tried the three nt_type options (smoothgrad / smoothgrad_sq / vargrad) and both 5 and 10 samples.

I'd be grateful for some hints! Thanks :-)

bilalsal commented 4 years ago

Hi @maikefer,

you might be facing the same issue as in #393: is it possible that the smoothed attributions just have a low variance, yet are non-zero? Did you measure the max - min of the returned attribution tensor?
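For reference, a quick way to run this check: the sketch below measures the dynamic range and the nonzero count of an attribution tensor, so low-variance results are not mistaken for all zeros. The `attrs` tensor here is a hypothetical stand-in; in practice it would come from `noise_tunnel.attribute(...)`.

```python
import torch

# Hypothetical attribution tensor standing in for NoiseTunnel's output;
# small random values simulate a low-variance but non-zero result.
torch.manual_seed(0)
attrs = torch.rand(1, 3, 8, 8) * 1e-6

# A tensor can look like all zeros when printed yet still carry signal:
# measure its actual dynamic range instead of eyeballing the printout.
value_range = (attrs.max() - attrs.min()).item()
print(f"min={attrs.min().item():.2e}, max={attrs.max().item():.2e}, range={value_range:.2e}")

# Distinguish truly zero attributions from merely low-variance ones.
is_all_zero = torch.count_nonzero(attrs).item() == 0
print("all zeros:", is_all_zero)
```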

maikefer commented 4 years ago

Hi @bilalsal, I'm not using Captum's visualization but am just looking at whatever the

 from captum.attr import GuidedGradCam, NoiseTunnel

 vis = GuidedGradCam(model, layer=model.layer4[2].conv3)
 noise_tunnel = NoiseTunnel(vis)
 noise_tunnel.attribute(sample, target=label, n_samples=10, nt_type="smoothgrad")

returns. And unfortunately this is a tensor of only zeros (min and max are 0, the shape is correct). I am using a ResNeXt architecture; GuidedGradCam without NoiseTunnel works perfectly fine, and NoiseTunnel around other attribution methods also works fine. Only this combination gives these odd results.

vivekmig commented 4 years ago

Hi @maikefer , this is surprising, combining GuidedGradCam and NoiseTunnel should work fine. Can you try setting n_samples to 1 and stdevs to 0 and see if the results are meaningful? This should be exactly the same as running GuidedGradCam directly.

I tried to reproduce this with a random input and the pretrained ResNeXt in torchvision, but wasn't able to reproduce it. If you can share more of your code or a colab notebook to reproduce the issue, we can also try to look into it further.

maikefer commented 4 years ago

Hi @vivekmig, I set n_samples to 1 and stdevs to 0, but there is no change: the resulting tensor is still all zeros. My input sample is in the range -1...1 and is four-dimensional (1, x, y, t). Maybe this causes an issue? I'll see how I can best share a meaningful snippet of my code, but I'll need a few days to do this. :-)

vivekmig commented 4 years ago

Hi @maikefer , this is likely a bug if GuidedGradCam does not match NoiseTunnel with n_samples = 1 and stdevs = 0. Sounds good; it will be easier to debug once you can share a code example.

You can also try printing the input at the beginning of your model's forward function and make sure it matches the input provided for attribution. Also, double-check the dtype of the input there, since adding noise of 0 in NoiseTunnel could be changing the dtype of the tensor. It might also be helpful to check whether GuidedBackprop or GradCAM with NoiseTunnel match the corresponding results without NoiseTunnel.
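One non-invasive way to do that inspection, sketched below with a dummy model: a forward pre-hook that reports the dtype and value range of whatever actually reaches the forward pass, without editing the model code itself.

```python
import torch
import torch.nn as nn

# Dummy model standing in for the real network; the hook works on any nn.Module.
model = nn.Sequential(nn.Conv2d(3, 4, kernel_size=3))

seen = []

def inspect_input(module, inputs):
    # Record dtype and value range of the tensor the model actually receives,
    # to spot NoiseTunnel silently altering the input (e.g. its dtype).
    x = inputs[0]
    seen.append((x.dtype, x.min().item(), x.max().item()))
    print(f"forward got dtype={x.dtype}, min={x.min().item():.3f}, max={x.max().item():.3f}")

handle = model.register_forward_pre_hook(inspect_input)
_ = model(torch.rand(1, 3, 16, 16))
handle.remove()
```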

One thing related to your shape (1, x, y, t): I assume this means the spatial dimensions are the second and third dimensions. In GuidedGradCAM, we interpolate the GradCAM result over the dimensions after the first two, which is consistent with the behavior of the convolution operator (assuming the first two dimensions are the number of examples and the number of channels, following the general PyTorch NCHW convention). I don't think this is the cause of the issue you're seeing, but it would be good to confirm that it makes sense to interpolate these dimensions.
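That upsampling step can be sketched with dummy shapes: everything after the first two dimensions is interpolated up to the input size, so for a (1, x, y, t) input only the trailing (y, t) dimensions are stretched while x is treated as the channel dimension. The shapes and the nearest-neighbor mode here are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

# Coarse GradCAM-style map from a conv layer: (batch, channel, h, w).
gradcam_map = torch.rand(1, 1, 4, 4)

# Hypothetical spatial size of the original input, i.e. its (y, t) dimensions.
input_spatial_size = (32, 16)

# Only the dimensions after (batch, channel) are interpolated.
upsampled = F.interpolate(gradcam_map, size=input_spatial_size, mode="nearest")
print(upsampled.shape)  # torch.Size([1, 1, 32, 16])
```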