Closed: RaymondJiangkw closed this issue 2 years ago
Great question. Yes, it's true that torch.autograd.functional.jacobian
can be used to compute the "grad network," and its output is identical to that of our implementation. There are two main reasons we created the extension.
We wanted to inspect and understand the architecture of the grad network. As far as I know, it is not possible to explicitly instantiate, view, or reuse the computational graph created by torch.autograd.functional.jacobian.
At the time of implementation, getting the gradient of the network's output with respect to its input in PyTorch required a full forward pass followed by a call to autograd. During training, this value was used to compute the loss at each iteration, and then another call to backward() was required to update the network weights. We thought it far more elegant (and efficient) to directly instantiate the grad network for training. Indeed, there are computational and memory savings with this approach, as we detailed in the paper.
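To illustrate what "directly instantiating the grad network" means, here is a minimal sketch with NumPy (hypothetical code, not the repo's actual autoint API): a one-hidden-layer sine-activated "integral network" and a separate "grad network" that shares the same parameters and computes the derivative in closed form, checked against a finite difference.

```python
import numpy as np

# Hypothetical one-hidden-layer integral network f(x) = w2 @ sin(w1 x + b1)
rng = np.random.default_rng(0)
w1 = rng.normal(size=(16, 1))   # hidden-layer weights
b1 = rng.normal(size=(16,))     # hidden-layer biases
w2 = rng.normal(size=(1, 16))   # output weights

def integral_net(x):
    # f(x) = w2 @ sin(w1 x + b1)
    return w2 @ np.sin(w1 @ x + b1[:, None])

def grad_net(x):
    # f'(x) = w2 @ diag(cos(w1 x + b1)) @ w1, reusing the SAME parameters;
    # this is an explicit network we can inspect, call, and train directly.
    return (w2 * np.cos(w1 @ x + b1[:, None]).T) @ w1

x = np.array([[0.3]])
analytic = grad_net(x)

# sanity check: central finite difference of the integral network
h = 1e-5
numeric = (integral_net(x + h) - integral_net(x - h)) / (2 * h)
assert np.allclose(analytic, numeric, atol=1e-6)
```

Because grad_net is an ordinary function of shared weights, a loss on its output can be backpropagated straight into w1, b1, and w2, avoiding the extra autograd pass the comment above describes.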
Thank you for your quick response! This answer does solve my question.
Hi, I am really interested in your ideas, and I hurried to implement a toy example to test them.
I see that you wrote a comprehensive and fairly involved extension in autoint to automatically "extract" the gradient network from the integral network, with a variety of methods, especially draw(). However, I wonder whether torch.autograd.functional.jacobian already does exactly the same job, if we are only talking about getting the partial derivatives of the outputs w.r.t. the inputs.
Attempting to answer this question myself, I wrote a simple MLP and tried to regress a polynomial function, using either jacobian or a manually computed derivative to "extract" the gradient network. Surprisingly, I found that jacobian can be about 60x slower than the manual computation. So, is speed one of the reasons you wrote the extension?
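A toy version of that comparison might look like the following (a hypothetical sketch, not the original experiment): the same df/dx for a small MLP computed once via torch.autograd.functional.jacobian and once via a closed-form manual derivative that reuses the MLP's weights, with a check that the two routes agree.

```python
import torch

# Hypothetical toy MLP: f(x) = W2 tanh(W1 x + b1) + b2
torch.manual_seed(0)
lin1 = torch.nn.Linear(1, 32)
lin2 = torch.nn.Linear(32, 1)

def mlp(x):
    return lin2(torch.tanh(lin1(x)))

x = torch.tensor([[0.5]])

# Route 1: autograd builds and traverses the graph on every call
jac = torch.autograd.functional.jacobian(mlp, x).reshape(1, 1)

# Route 2: manual derivative, d/dx = W2 diag(1 - tanh^2(W1 x + b1)) W1,
# evaluated as one fused expression over the shared weights
pre = lin1(x)                                                       # (1, 32)
manual = (lin2.weight * (1 - torch.tanh(pre) ** 2)) @ lin1.weight   # (1, 1)

assert torch.allclose(jac, manual, atol=1e-6)
```

Wrapping both routes in a timing loop (e.g. timeit) would reproduce the kind of speed gap described above; the exact ratio depends on network size and hardware.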