idiap / fullgrad-saliency

Full-gradient saliency maps

Why not use hooks? #7

Closed FrancescoSaverioZuppichini closed 4 years ago

FrancescoSaverioZuppichini commented 4 years ago

Dear all,

First of all, congratulations on the very nice paper and implementation. The current API design prevents plug-and-play use with different models: each model has to be rewritten to expose the quantities you need. Why not use the hooks API (https://pytorch.org/tutorials/beginner/former_torchies/nnft_tutorial.html#forward-and-backward-function-hooks) to get the parameters you need?
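
For concreteness, here is a minimal sketch of the kind of hook-based extraction I have in mind (the function and variable names are only illustrative, not code from this repository):

```python
import torch
import torch.nn as nn

def collect_features(model, x):
    """Capture intermediate outputs with forward hooks, without modifying the model."""
    features, handles = [], []

    def forward_hook(module, inputs, output):
        # Store the output of every convolution / batch-norm layer we hooked.
        features.append(output)

    for m in model.modules():
        if isinstance(m, (nn.Conv2d, nn.BatchNorm2d)):
            handles.append(m.register_forward_hook(forward_hook))

    out = model(x)

    # Remove the hooks so the model is left exactly as it was.
    for h in handles:
        h.remove()
    return out, features
```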

Thank you,

Francesco Saverio Zuppichini

suraj-srinivas commented 4 years ago

Hi,

It's probably true that I can greatly simplify things by using hooks. The main reason I'm not using them here is that in my research code I treated consecutive convolution and batch norm as a single linear layer, and thus needed to compute the overall bias of that effective linear layer. While this doesn't make a difference for the full-gradient decomposition, it technically changes the FullGrad saliency results very slightly. However, it probably shouldn't matter in practice. I'll try to update the code using hooks. Thanks for the suggestion!
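
For concreteness, a rough sketch of the conv + batch-norm folding I mean, i.e. the bias of the single linear layer equivalent to the pair in eval mode (the helper name is just illustrative, not the repo code):

```python
import torch
import torch.nn as nn

def effective_bias(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> torch.Tensor:
    """Bias of the single linear layer equivalent to conv followed by bn (eval mode)."""
    b = conv.bias if conv.bias is not None else torch.zeros(conv.out_channels)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)  # gamma / sqrt(var + eps)
    return scale * (b - bn.running_mean) + bn.bias           # gamma * (b - mu) / sqrt(var + eps) + beta
```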

Regards, Suraj Srinivas

FrancescoSaverioZuppichini commented 4 years ago

Glad to be useful :)

NRauschmayr commented 4 years ago

Take a look at the smdebug library. smdebug can automatically capture tensors during model training; you only need to specify a regular expression matching the names of the tensors to be emitted (no need to modify the model).

I read the paper 'Full-Gradient Representation for Neural Network Visualization' and it is a really great paper! Congrats!

Based on it, I created a sample notebook a while ago that uses smdebug to create saliency maps based on the fullgrad method.

suraj-srinivas commented 4 years ago

The notebook looks amazing! I'll take a look at the libraries you mention. Thanks!

suraj-srinivas commented 4 years ago

I've re-implemented the algorithm using hooks, and hopefully now it's easier to use. Thanks again for the suggestion!

nephina commented 3 years ago

Hello, first of all, I love this implementation and it has been working wonderfully for me! But I noticed that PyTorch recently started throwing a deprecation warning for the hook used on line 28 of saliency.tensor_extractor: `handle_g = m.register_backward_hook(self._extract_layer_grads)`

The warning is:

UserWarning: Using a non-full backward hook when the forward contains multiple autograd Nodes is deprecated and will be removed in future versions. This hook will be missing some grad_input. Please use register_full_backward_hook to get the documented behavior.

I attempted to fix the issue myself but wasn't able to; I don't understand the difference between register_backward_hook and register_full_backward_hook.
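
For reference, a minimal sketch of how the newer register_full_backward_hook API is used; as far as I understand, the "full" variant reports grad_input/grad_output with respect to the whole module even when its forward pass contains several autograd nodes, which the old hook did not guarantee (the model and hook below are just illustrative):

```python
import torch
import torch.nn as nn

def grad_hook(module, grad_input, grad_output):
    # grad_output[0] is the gradient of the loss w.r.t. this module's output;
    # with the "full" hook it is reported reliably even when the module's
    # forward pass consists of several autograd nodes.
    print(module.__class__.__name__, grad_output[0].shape)

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
handles = [m.register_full_backward_hook(grad_hook)
           for m in model.modules() if isinstance(m, (nn.Conv2d, nn.BatchNorm2d))]

model(torch.randn(1, 3, 32, 32)).sum().backward()

for h in handles:
    h.remove()
```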

Is this an issue you see as worth solving? It currently works for me, so there is no immediate urgency, and I can always pin my code to a specific version of PyTorch, but I love FullGrad so much that I would hate for it to be unable to move forward with PyTorch.

suraj-srinivas commented 3 years ago

Hi, I'm redirecting the discussion on this to another issue thread.