I noticed there are two back-propagations, one for the generator and one for the encoder:
https://github.com/wiseodd/controlled-text-generation/blob/master/train_discriminator.py#L120-L122
https://github.com/wiseodd/controlled-text-generation/blob/master/train_discriminator.py#L130-L132
After back-propagating the generator loss, the code calls zero_grad to clear the generator's gradients in the auto-encoder. However, the encoder is also on that forward path, and its gradients are preserved. The encoder loss is then computed and back-propagated again, so the gradient of the VAE loss is accumulated twice on the encoder and its final value is doubled. Is my understanding correct here?
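To illustrate what I mean, here is a minimal PyTorch sketch (toy modules and loss stand-ins, not the ones defined in this repo) showing that calling zero_grad only on the generator leaves the encoder's gradients from the first backward pass in place, so the second backward pass accumulates on top of them:

```python
import torch
import torch.nn as nn

# Toy stand-ins for the encoder and generator (hypothetical, not the repo's modules).
encoder = nn.Linear(4, 4)
generator = nn.Linear(4, 4)

x = torch.randn(2, 4)
z = encoder(x)            # the encoder sits on the forward path of both losses
out = generator(z)

loss_G = out.sum()        # stand-in for the generator loss
loss_E = (z ** 2).sum()   # stand-in for the encoder / VAE loss

loss_G.backward(retain_graph=True)   # fills grads of BOTH generator and encoder
generator.zero_grad()                # only the generator's grads are cleared

grad_from_G = encoder.weight.grad.clone()
loss_E.backward()                    # grads are added, not overwritten

print("encoder grad norm after loss_G:", grad_from_G.norm().item())
print("encoder grad norm after both  :", encoder.weight.grad.norm().item())
# The second value still contains the leftover contribution from loss_G,
# which is the accumulation I am describing above.
```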