Hello. I had a question about your network's computational graph during backpropagation. Since you pass return_info=True in your call to self.forward(), does that mean your self.training_step() ultimately returns a dictionary called info that contains not only the loss you want to optimize but also many other loss terms that still carry valid grad_fn properties? For reference: https://github.com/arneschneuing/DiffSBDD/blob/ca2d2ad4451893ec405308134fdffbe94e298b64/lightning_modules.py#L319
I ask because Lightning is supposed to backpropagate using the loss term only. However, would you agree that it might be safer to explicitly detach all the other loss terms from the computational graph during training_step()?
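For concreteness, here is a rough sketch of the kind of change I have in mind. The forward signature and the exact contents of info are just my assumptions, not the actual DiffSBDD code:

```python
import torch
import pytorch_lightning as pl


class ExampleModule(pl.LightningModule):
    def training_step(self, batch, batch_idx):
        # Hypothetical call: forward returns the training loss plus an `info`
        # dict of auxiliary loss terms that still carry grad_fn.
        loss, info = self.forward(batch, return_info=True)

        # Detach the auxiliary terms so that only `loss` remains attached to
        # the computational graph that Lightning backpropagates through.
        detached_info = {
            k: (v.detach() if isinstance(v, torch.Tensor) else v)
            for k, v in info.items()
        }

        self.log_dict(detached_info)
        return {"loss": loss, **detached_info}
```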
Hi @amorehead, thanks for another good observation. I haven't had any issues so far, but detaching the other variables might indeed be safer. I will consider it for future updates.
Thank you!