I am trying to figure out why in line 86 of gradcam.py, negative gradients are being clipped.
Why is this line necessary for visualizing the gradients? Don't we also want to show negative gradients, since they essentially imply that the greater a channel's average influence, the greater the decrease in the loss? I may be overthinking things here, so I hope you can shed some light on this, as I cannot wrap my head around it. Thanks!
Hello, if my memory serves me right, it is because the authors apply a ReLU after taking the weighted average, which is basically what line 86 does. The idea in the paper is that only features with a *positive* influence on the class of interest should appear in the heatmap; negative values likely belong to other classes, so they are clipped. You can have a look at the paper for the exact operations.
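To make the reply concrete, here is a minimal sketch of the Grad-CAM computation being described: pool the gradients per channel, take the weighted sum of activation maps, then clamp negatives. The function and variable names are illustrative, not taken from gradcam.py.

```python
import numpy as np

def grad_cam_heatmap(activations, gradients):
    """Minimal Grad-CAM sketch (illustrative names, not from gradcam.py).

    activations: (C, H, W) feature maps from the target conv layer
    gradients:   (C, H, W) gradients of the class score w.r.t. those maps
    """
    # Global-average-pool the gradients to get one importance weight per channel
    weights = gradients.mean(axis=(1, 2))             # shape (C,)
    # Weighted sum of the activation maps across channels
    cam = np.tensordot(weights, activations, axes=1)  # shape (H, W)
    # ReLU: keep only regions with a positive influence on the class score;
    # this clamping is the step the question asks about
    return np.maximum(cam, 0)
```

Dropping the `np.maximum` line would let negative regions through, which the paper argues correspond to evidence for *other* classes rather than the class being visualized.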