[Open] ahmed-fau opened this issue 6 years ago
Because autograd.grad takes in a sequence (tuple) of tensors as inputs (the tensors with respect to which the gradient of the output is to be computed), the return value is also a sequence (tuple) of gradient tensors, one for each input tensor in the sequence. Here, since you are differentiating with respect to only one input tensor, you take the [0]th gradient tensor from the sequence.
In fact, there is only one element in the tuple returned by autograd.grad here; [0] just removes the tuple container and extracts the gradient tensor inside.
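As a quick sketch (the tensors below are invented purely for illustration), you can see the tuple behavior directly:

import torch

# Two inputs: autograd.grad returns a tuple with one gradient per input.
x = torch.randn(3, requires_grad=True)
y = torch.randn(3, requires_grad=True)
grads = torch.autograd.grad(outputs=(x * y).sum(), inputs=(x, y))
print(type(grads), len(grads))   # <class 'tuple'> 2

# One input: the tuple has a single element, so [0] unwraps it.
grad_x = torch.autograd.grad(outputs=(x ** 2).sum(), inputs=x)[0]
print(torch.allclose(grad_x, 2 * x))   # True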
This is a reply from the author of the paper to the same question; I hope it helps: https://github.com/igul222/improved_wgan_training/issues/34
The original question, for reference:

Hi,
I am a little confused about taking only index [0] of the gradient tensor in the penalty function:
gradients = autograd.grad(outputs=disc_interpolates, inputs=interpolates,
                          grad_outputs=torch.ones(disc_interpolates.size()),
                          create_graph=True, retain_graph=True, only_inputs=True)[0]
Why is it not possible to take the whole grad tensor?
Best
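For broader context, here is a minimal sketch of how that line typically sits inside a WGAN-GP gradient penalty; netD, real_data, fake_data, and lambda_gp are assumed names (not from this thread), and 2-D (batch, features) inputs are assumed:

import torch
from torch import autograd

def gradient_penalty(netD, real_data, fake_data, lambda_gp=10.0):
    # Random interpolation points between real and fake samples
    # (assumed 2-D data, so alpha broadcasts over the feature dim).
    alpha = torch.rand(real_data.size(0), 1, device=real_data.device)
    interpolates = (alpha * real_data + (1 - alpha) * fake_data).requires_grad_(True)

    disc_interpolates = netD(interpolates)

    # autograd.grad returns a 1-element tuple here; [0] extracts the
    # gradient of the critic output with respect to interpolates.
    gradients = autograd.grad(outputs=disc_interpolates, inputs=interpolates,
                              grad_outputs=torch.ones_like(disc_interpolates),
                              create_graph=True, retain_graph=True, only_inputs=True)[0]

    # Penalize deviation of the per-sample gradient norm from 1.
    return lambda_gp * ((gradients.norm(2, dim=1) - 1) ** 2).mean()

Without the [0], gradients would be a tuple rather than a tensor, and the subsequent .norm(...) call would fail, which is exactly why the unwrap is needed.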