Hi, I don't quite understand why dec_hiddens.grad needs to be detached on line 126: https://github.com/seanie12/CLAPS/blob/main/src/summarization/models.py#L126

The following is the source code of the detach operation. As it shows, detach only does two things: it sets the Variable's grad_fn to None and sets its requires_grad to False.
```python
def detach(self):
    """Returns a new Variable, detached from the current graph.

    Result will never require gradient. If the input is volatile, the output
    will be volatile too.

    .. note::
      Returned Variable uses the same data tensor, as the original one, and
      in-place modifications on either of them will be seen, and may trigger
      errors in correctness checks.
    """
    result = NoGrad()(self)  # this is needed, because it merges version counters
    result._grad_fn = None
    return result
```
But the grad_fn of dec_hiddens.grad is already None, and its requires_grad is already False. Is the detach really necessary here?
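For reference, here is a minimal standalone check (a toy tensor, not the CLAPS code) showing that a gradient populated by backward() already has these properties:

```python
import torch

# Toy example, unrelated to the CLAPS model: the .grad tensor is created
# outside the autograd graph, so it already has no grad_fn and does not
# require grad.
x = torch.randn(3, requires_grad=True)
(x * 2).sum().backward()

print(x.grad.grad_fn)        # None
print(x.grad.requires_grad)  # False
```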
I'm not sure whether my understanding is correct; I hope to get your answer. Thank you!