YoojLee opened this issue 2 years ago
These are two separate questions: (1) should we optimize G and D jointly or separately? (2) If we optimize them separately, do we need to compute gradients for D while updating G?
For (2): as long as we don't call optimizer_D.step(), any gradients computed for D will never be applied in an update. Therefore, we set requires_grad to False in Line 185 to skip that unnecessary gradient computation.
For (1), most authors optimize them separately, following the original paper's practice.
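To illustrate the separate-update pattern described above, here is a minimal sketch with toy linear networks standing in for G and D (the names `G`, `D`, `opt_G`, `opt_D`, and the toy losses are illustrative, not the repo's actual code):

```python
# Sketch of alternating G/D updates, freezing D's parameters during the
# G step so autograd skips computing their gradients (a speed-up).
import torch
import torch.nn as nn

G = nn.Linear(4, 4)          # stand-in generator
D = nn.Linear(4, 1)          # stand-in discriminator
opt_G = torch.optim.SGD(G.parameters(), lr=0.1)
opt_D = torch.optim.SGD(D.parameters(), lr=0.1)

real = torch.randn(8, 4)

# --- update G: freeze D so its gradients are not computed ---
for p in D.parameters():
    p.requires_grad = False
opt_G.zero_grad()
loss_G = -D(G(real)).mean()   # toy generator loss
loss_G.backward()             # grads still flow *through* D into G
opt_G.step()

# --- update D: re-enable its gradients ---
for p in D.parameters():
    p.requires_grad = True
opt_D.zero_grad()
# detach() blocks gradients from flowing back into G during the D step
loss_D = D(G(real).detach()).mean() - D(real).mean()
loss_D.backward()
opt_D.step()
```

Note that even with D frozen, gradients still propagate through D's operations into G; `requires_grad = False` only skips accumulating `.grad` on D's own parameters.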
@junyanz Thank you for your reply! Still, I am wondering why we set requires_grad to False in Line 185. Is it mandatory, or does it serve another purpose, such as a speed-up?
https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix/blob/003efc4c8819de47ff11b5a0af7ba09aee7f5fc1/models/cycle_gan_model.py#L185
Thanks for the nice work! I am quite confused that freezing D while optimizing G is just a speed-up (according to a reply in a previous issue on this topic). I thought freezing D was important, since G and D should be isolated from each other during optimization. Does it really have nothing to do with the "performance" of training? I would like to confirm that the code I mentioned was written merely to speed up training.
Thanks!
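On the correctness question: a quick way to convince yourself is to run a G update *without* freezing D and check that D's weights are untouched, because only `opt_G.step()` is called. A minimal sketch with toy networks (names are illustrative, not the repo's):

```python
# Gradients accumulate in D's .grad buffers during the G update, but
# D's weights only change when its own optimizer steps, so freezing D
# affects speed, not the result of the update.
import torch
import torch.nn as nn

G, D = nn.Linear(4, 4), nn.Linear(4, 1)
opt_G = torch.optim.SGD(G.parameters(), lr=0.1)

before = [p.detach().clone() for p in D.parameters()]

# G update WITHOUT freezing D: gradients are computed for D too...
loss_G = -D(G(torch.randn(8, 4))).mean()
loss_G.backward()
opt_G.step()          # ...but only G's optimizer steps

# D's weights are unchanged (though its .grad buffers are populated).
unchanged = all(torch.equal(b, p.detach())
                for b, p in zip(before, D.parameters()))
print(unchanged)      # True
```

The only caveat is that those stale gradients in D would pollute the next D update if they were not cleared, which is why the standard pattern zeroes D's gradients (or freezes D) before its own step.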