How do the gradients pass down if there's a tf.multinomial() sampling process?

LantaoYu / SeqGAN

Implementation of Sequence Generative Adversarial Nets with Policy Gradient


Closed kunrenzhilu closed 7 years ago

kunrenzhilu commented 7 years ago

Since the sampling procedure is not differentiable, how does the error from the Discriminator pass down to train the generator?

kunrenzhilu commented 7 years ago

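Policy gradients... never mind. The generator is trained with policy gradients, so the Discriminator's score is used as a reward instead of being backpropagated through the sampling step.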
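For anyone landing here with the same question, below is a minimal sketch of how a policy gradient (REINFORCE-style) estimator gets around the non-differentiable sampling step. It is not taken from this repo: it uses plain Python instead of TensorFlow, `sample` is only a stand-in for `tf.multinomial()`, and `reward_fn` is a hypothetical placeholder for the Discriminator's score on a sampled token. The idea is that the gradient is taken with respect to the log-probability of the sampled token, weighted by the reward, so nothing has to flow through the sample itself.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw one index from a categorical distribution (stand-in for tf.multinomial)."""
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1  # guard against floating-point round-off


def reinforce_grad(logits, reward_fn, rng, n_samples=20000):
    """Monte Carlo estimate of d E[reward] / d logits using the score function."""
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for _ in range(n_samples):
        a = sample(probs, rng)   # non-differentiable step
        r = reward_fn(a)         # treated as a constant, no gradient flows through it
        for k in range(len(logits)):
            # d log p(a) / d logit_k = 1[k == a] - p_k
            grad[k] += r * ((1.0 if k == a else 0.0) - probs[k])
    return [g / n_samples for g in grad]


def exact_grad(logits, reward_fn):
    """Analytic gradient of the expected reward, for comparison only."""
    probs = softmax(logits)
    expected = sum(p * reward_fn(i) for i, p in enumerate(probs))
    # d E[r] / d logit_k = p_k * (r_k - E[r])
    return [probs[k] * (reward_fn(k) - expected) for k in range(len(logits))]


# Hypothetical numbers: three tokens, with made-up Discriminator scores as rewards.
rewards = [0.2, 0.9, 0.4]
logits = [0.1, -0.3, 0.5]
estimate = reinforce_grad(logits, lambda a: rewards[a], random.Random(0))
exact = exact_grad(logits, lambda a: rewards[a])
print("estimated:", estimate)
print("exact:    ", exact)
```

The two printed gradients should agree up to sampling noise. In a TensorFlow graph the same effect is usually obtained by minimizing `-reward * log_prob(sampled_token)` with the reward held constant, though the exact loss used in this repo may differ from this sketch.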