Hi, I don't quite understand why dec_hiddens.grad needs to be detached on line 126: https://github.com/seanie12/CLAPS/blob/main/src/summarization/models.py#L126

The following is the source code of the detach operation. As it shows, detach only does two things: it sets the Variable's grad_fn to None and sets its requires_grad to False.
```python
def detach(self):
    """Returns a new Variable, detached from the current graph.

    Result will never require gradient. If the input is volatile, the output
    will be volatile too.

    .. note::
      Returned Variable uses the same data tensor, as the original one, and
      in-place modifications on either of them will be seen, and may trigger
      errors in correctness checks.
    """
    result = NoGrad()(self)  # this is needed, because it merges version counters
    result._grad_fn = None
    return result
```
But the grad_fn of dec_hiddens.grad is already None, and its requires_grad is already False. Is the detach really necessary here?
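For reference, here is a minimal standalone check (a toy tensor, not the CLAPS code) showing that a gradient populated by backward() already has these properties:

```python
import torch

# Toy example, unrelated to the CLAPS model: the .grad tensor is created
# outside the autograd graph, so it already has no grad_fn and does not
# require grad.
x = torch.randn(3, requires_grad=True)
(x * 2).sum().backward()

print(x.grad.grad_fn)        # None
print(x.grad.requires_grad)  # False
```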
I'm not sure whether my understanding is correct; I hope to get your answer. Thank you!