Hi, thank you for the implementation.
I have a question regarding the adaptive gradient clipping. My understanding is that in PyTorch Lightning, loss.backward() happens after the trainer receives the loss from pl_module.training_step(). But in gan.py, adaptive gradient clipping takes place before the loss is returned, which means the gradients are still in their state from the previous step. Is this an approximation by design, or am I missing something?
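For reference, one way to guarantee the clipping sees the current step's gradients would be Lightning's on_after_backward hook, which runs right after loss.backward(). The sketch below is just to illustrate the timing; the adaptive_gradient_clip_ helper is my own simplified per-tensor version of AGC (the paper's rule is unit-wise), not the implementation in gan.py:

```python
import torch
import pytorch_lightning as pl


def adaptive_gradient_clip_(parameters, clip_factor=0.01, eps=1e-3):
    # Simplified AGC: rescale each gradient so that
    # ||g|| <= clip_factor * max(||p||, eps).
    for p in parameters:
        if p.grad is None:
            continue
        p_norm = p.detach().norm().clamp_(min=eps)
        g_norm = p.grad.detach().norm()
        max_norm = clip_factor * p_norm
        if g_norm > max_norm:
            # In-place rescale; small constant avoids division by zero.
            p.grad.detach().mul_(max_norm / (g_norm + 1e-6))


class MyGAN(pl.LightningModule):
    ...

    def on_after_backward(self):
        # Called after loss.backward(), so the gradients here
        # belong to the current step rather than the previous one.
        adaptive_gradient_clip_(self.parameters())
```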
Thank you in advance for your answer.