Open thuangb opened 2 years ago
Hi, I have a question about the generator update step. From the code, I see that you disable gradients for the discriminator. I think this is neither correct nor necessary, because netD is not updated in this step, is it? Other GAN implementations do not disable gradients like this.

I believe this saves memory if `zero_grad(set_to_none=True)` is used in the right spots. All it does is tell netD not to store gradients during the backward pass of the G step. It isn't strictly necessary for correctness, since `netD.zero_grad()` erases them anyway if they are stored. In this repo, though, I don't think it actually saves memory, because the gradient tensors are still allocated, just zeroed.
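A minimal sketch of the pattern under discussion (the `netD`/`netG` modules here are hypothetical stand-ins, not the repo's actual models): freezing D's parameters during the G step means the backward pass never allocates `.grad` buffers for D, and `zero_grad(set_to_none=True)` frees old gradient tensors instead of merely zeroing them.

```python
import torch
import torch.nn as nn

# Hypothetical minimal stand-ins for the netD / netG from the discussion.
netD = nn.Linear(4, 1)
netG = nn.Linear(2, 4)
optG = torch.optim.SGD(netG.parameters(), lr=0.1)

# --- G step ---
# Freeze D so backward() does not allocate .grad buffers for its parameters.
for p in netD.parameters():
    p.requires_grad_(False)

z = torch.randn(8, 2)
loss_g = netD(netG(z)).mean()
optG.zero_grad(set_to_none=True)  # frees G's stale grad tensors rather than zeroing them
loss_g.backward()                  # gradients flow *through* netD into netG only
optG.step()

# D's parameters received no gradient buffers during the G step.
print(all(p.grad is None for p in netD.parameters()))  # prints True

# Unfreeze D before its own update step.
for p in netD.parameters():
    p.requires_grad_(True)
```

Note that gradients still flow *through* the frozen discriminator to reach the generator; freezing only skips allocating and accumulating `.grad` for D's own parameters, which is why the G update remains correct either way.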