zju-vipa / OSGAN

One-Stage GAN for Efficient Adversarial Learning. The implementation of CVPR 2021 paper: Training Generative Adversarial Networks in One Stage.

How to extract the gradient of the generator in one-stage #1

Open naoki7090624 opened 3 years ago

naoki7090624 commented 3 years ago

Thanks for sharing your code; it's really great work.

I would like to apply this technique to other models using GANs to reduce the training time. For implementation, I have two questions about your code.

  1. Why do you set reduction='none' and then take the average of loss_fake in the one-stage version? train_dcgan_asymmetric_one_stage_simple.py#L60
  2. How do you extract the generator's gradient from the discriminator's gradient? In this code, train_dcgan_asymmetric_one_stage_simple.py#L72, it looks like the generator backpropagates the same gradient as the discriminator.

Thank you.

sccbhxc commented 3 years ago

Thanks for your interest in our work.

  1. We set reduction='none' for loss_fake and loss_g because get_gradient_ratios() needs the image-wise loss, not the batch-average loss, to compute an image-wise gradient ratio (gamma is a vector whose length equals the number of fake images in the batch).
  2. We scale the gradients derived from the generator relative to the gradients of the discriminator. This is implemented in GradientScaler: https://github.com/zju-vipa/OSGAN/blob/fdfea7e7d9c7627d19fca26b992aaac502a02313/utils/modules.py#L4, which corresponds to Eq. S8 in the supplementary material: https://openaccess.thecvf.com/content/CVPR2021/supplemental/Shen_Training_Generative_Adversarial_CVPR_2021_supplemental.pdf.
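To illustrate point 1, here is a minimal sketch (not the repo's actual code) of how `reduction='none'` yields one loss value per image, which is what an image-wise gamma vector requires; the tensor shapes and labels below are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

batch = 4
logits_fake = torch.randn(batch, 1)   # discriminator outputs on fake images
fake_label = torch.zeros(batch, 1)
real_label = torch.ones(batch, 1)

# With reduction='none', each loss is a vector with one entry per image
# instead of a single batch-averaged scalar.
loss_fake = F.binary_cross_entropy_with_logits(logits_fake, fake_label, reduction='none')
loss_g = F.binary_cross_entropy_with_logits(logits_fake, real_label, reduction='none')

print(loss_fake.shape, loss_g.shape)  # torch.Size([4, 1]) torch.Size([4, 1])
```

From these per-image losses, a function like get_gradient_ratios() can compute a gamma with one entry per fake image; averaging first would collapse the batch to a single scalar and lose that per-image information.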
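For point 2, the usual way to implement this kind of gradient scaling is a custom autograd function that is the identity in the forward pass and multiplies the incoming gradient by gamma in the backward pass. The sketch below is an assumed simplification of the repo's GradientScaler, not a copy of it:

```python
import torch

class GradientScaler(torch.autograd.Function):
    """Identity in the forward pass; scales the gradient by `gamma`
    in the backward pass, so the generator receives gamma-scaled
    discriminator gradients rather than the raw ones."""

    @staticmethod
    def forward(ctx, x, gamma):
        ctx.save_for_backward(gamma)
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        (gamma,) = ctx.saved_tensors
        # Scale the gradient flowing back; gamma itself gets no gradient.
        return grad_output * gamma, None

x = torch.randn(4, 1, requires_grad=True)
gamma = torch.full((4, 1), 0.5)
y = GradientScaler.apply(x, gamma)
y.sum().backward()
print(x.grad)  # every entry is 0.5
```

Because the forward pass is the identity, inserting this between the generator output and the discriminator changes nothing in the forward computation; only the backward gradient is rescaled per image.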
naoki7090624 commented 3 years ago

Thanks for your reply.

Does the GradientScaler function only scale the gradient of the generator and not the Discriminator?

Sorry for the additional question. Can OSGAN be used even if there are losses other than adversarial loss? For example, generator loss = adversarial loss + perceptual loss + reconstruction loss.

ElegantLee commented 2 years ago

> Thanks for your reply.
>
> Does the GradientScaler function only scale the gradient of the generator and not the Discriminator?
>
> Sorry for the additional question. Can OSGAN be used even if there are losses other than adversarial loss? For example, generator loss = adversarial loss + perceptual loss + reconstruction loss.

I have the same question about how to backpropagate when the generator has losses beyond the adversarial loss. The way I worked around it is `loss_pack = loss_pack + other_g_losses`. I'm not sure whether this is correct; I hope the author can answer. :joy:
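A minimal sketch of that workaround, with assumed shapes and a hypothetical reconstruction loss (not verified against the OSGAN code): keep every generator loss per-image so the gamma computation still sees one value per sample, and sum the extra losses into the pack before reducing:

```python
import torch

batch = 4
# Per-image adversarial loss for the generator (reduction='none' style)
loss_pack = torch.rand(batch, requires_grad=True)
# Hypothetical extra generator loss, also kept per-image (e.g. L1 reconstruction)
loss_recon = torch.rand(batch, requires_grad=True)
lambda_recon = 10.0  # assumed weighting, not from the paper

# Still one value per image, so image-wise gradient ratios remain possible
total = loss_pack + lambda_recon * loss_recon
total.mean().backward()

print(total.shape)  # torch.Size([4])
```

Whether this preserves the one-stage gradient-ratio derivation for the non-adversarial terms is exactly the open question here; the sketch only shows the shape bookkeeping of the proposed workaround.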