idiap / fullgrad-saliency

Full-gradient saliency maps

register_backward_hook deprecated #10

Open suraj-srinivas opened 3 years ago

suraj-srinivas commented 3 years ago

Hello, first of all, I love this implementation and it has been working wonderfully for me! However, I noticed that PyTorch recently started emitting a deprecation warning for the hook registered on line 28 of saliency.tensor_extractor: handle_g = m.register_backward_hook(self._extract_layer_grads)

The warning is:

UserWarning: Using a non-full backward hook when the forward contains multiple autograd Nodes is deprecated and will be removed in future versions. This hook will be missing some grad_input. Please use register_full_backward_hook to get the documented behavior.
  warnings.warn("Using a non-full backward hook when the forward contains multiple autograd Nodes "

I attempted to fix the issue myself but wasn't able to; I don't understand the difference between register_backward_hook and register_full_backward_hook.

Do you see this as an issue worth solving? It currently works for me, so there is no immediate demand, and I can always state that my code requires a specific version of PyTorch, but I love fullgrad so much that I would hate for it to be unable to move forward with PyTorch.

Originally posted by @nephina in https://github.com/idiap/fullgrad-saliency/issues/7#issuecomment-808831311

suraj-srinivas commented 3 years ago

Hi @nephina, I'm glad that you like this repository! :) Do you mind elaborating a bit about which domain you use it in?

Regarding the warnings, it seems register_full_backward_hook comes with its own problem: in-place operations are not allowed on the inputs or outputs of a module that has such a hook. Most models use in-place ReLU, and ResNets also use an in-place addition for the residual connection, which makes this difficult to solve in general. For immediate use (if required), you can modify the model definition files so that they do not use in-place operations, and use register_full_backward_hook instead of register_backward_hook. I am looking into more general ways to solve this problem.
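
A minimal sketch of that workaround, under some assumptions: the toy model, the disable_inplace helper and the GradExtractor class below are hypothetical illustrations, not the repository's code. The sketch switches off the inplace flag on modules that expose one (such as nn.ReLU) and then registers full backward hooks on the convolutional and linear layers.

```python
import torch
import torch.nn as nn


def disable_inplace(model: nn.Module) -> int:
    """Set inplace=False on every submodule that has inplace=True (e.g. nn.ReLU).

    Hypothetical helper; returns the number of modules that were changed.
    """
    changed = 0
    for m in model.modules():
        if getattr(m, "inplace", False) is True:
            m.inplace = False
            changed += 1
    return changed


class GradExtractor:
    """Hypothetical stand-in for an extractor that stores per-layer output gradients."""

    def __init__(self, model: nn.Module):
        self.grads = []
        self.handles = []
        for m in model.modules():
            if isinstance(m, (nn.Conv2d, nn.Linear)):
                # Full backward hook instead of the deprecated register_backward_hook.
                handle = m.register_full_backward_hook(self._extract_layer_grads)
                self.handles.append(handle)

    def _extract_layer_grads(self, module, grad_input, grad_output):
        # grad_output is a tuple; keep the gradient w.r.t. the layer's output.
        self.grads.append(grad_output[0].detach())

    def remove(self):
        for handle in self.handles:
            handle.remove()


# Toy model (an assumption for illustration) with an in-place ReLU.
model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Flatten(),
    nn.Linear(4 * 8 * 8, 2),
)

num_changed = disable_inplace(model)
extractor = GradExtractor(model)

x = torch.randn(1, 3, 8, 8, requires_grad=True)
model(x).sum().backward()
extractor.remove()

# Hooks fire in backward order: the Linear layer first, then the Conv2d layer.
shapes = [tuple(g.shape) for g in extractor.grads]
```

Note that this only covers in-place behaviour controlled by a module attribute. An in-place addition written directly in a forward method (for example a residual connection using +=) would still need to be edited by hand in the model definition.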

nephina commented 3 years ago

I am using it in an image-ranking method I am developing, in order to visualize the salient features for determination of rank. Thanks for the tip, I will try using nets which don't use in-place operations.