arneschneuing / DiffSBDD

A Euclidean diffusion model for structure-based drug design.
MIT License
339 stars · 74 forks

Question about gradient flow during training #10

Closed · amorehead closed this issue 1 year ago

amorehead commented 1 year ago

Hello. I had a question about your network's computational graph during backpropagation. Since you pass return_info=True in your call to self.forward(), does that mean your self.training_step() ultimately returns a dictionary called info that contains not only the loss you want to optimize against but also many other loss terms that still carry valid grad_fn properties?

https://github.com/arneschneuing/DiffSBDD/blob/ca2d2ad4451893ec405308134fdffbe94e298b64/lightning_modules.py#L319

I ask because Lightning is supposed to backpropagate using only the loss term. However, would you agree that it might be safer to explicitly detach all the other loss terms from the computational graph in training_step()?
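To illustrate what I mean, here is a minimal, self-contained sketch of the pattern (a toy model with hypothetical names, not the actual DiffSBDD code):

```python
import torch
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    """Toy stand-in for the real model; forward() mimics a
    return_info=True call that hands back auxiliary loss terms."""

    def __init__(self):
        super().__init__()
        self.net = torch.nn.Linear(8, 1)

    def forward(self, batch, return_info=False):
        pred = self.net(batch)
        loss = pred.pow(2).mean()
        aux = pred.abs().mean()  # an auxiliary loss term, still on the graph
        if return_info:
            return loss, {"aux_loss": aux}
        return loss

    def training_step(self, batch, batch_idx):
        loss, info = self.forward(batch, return_info=True)
        info["loss"] = loss
        # Lightning backpropagates only through info["loss"], but
        # info["aux_loss"] also carries a valid grad_fn at this point.
        return info

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters())
```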

arneschneuing commented 1 year ago

Hi @amorehead, thanks for another good observation. I haven't had any issues so far, but detaching the other variables might indeed be safer. I will consider it for future updates. Thank you!
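For reference, one way to do this could look like the following (a sketch with a hypothetical helper name, not code from this repository):

```python
import torch


def detach_aux_terms(info: dict, keep: str = "loss") -> dict:
    """Detach every tensor in `info` except the entry under `keep`,
    so only the actual training loss retains a grad_fn."""
    return {
        k: v.detach() if torch.is_tensor(v) and k != keep else v
        for k, v in info.items()
    }
```

Calling info = detach_aux_terms(info) right before returning from training_step() would keep the auxiliary terms available for logging while dropping their references to the computational graph.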