SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient
Author/Institution
Lantao Yu, Weinan Zhang, Jun Wang, Yong Yu
Shanghai Jiao Tong University, University College London
What is this
Proposes SeqGAN, a framework for generating sequences of discrete tokens by training a GAN with reinforcement learning
Comparison with previous researches. What are the novelties/good points?
A standard GAN has two limitations when the goal is generating sequences of discrete tokens:
The discrete outputs from the generative model make it difficult to pass the gradient update from the discriminative model back to the generative model (sampling a token is not differentiable).
The discriminative model can only assess a complete sequence; for a partially generated sequence, it is non-trivial to balance its current score against the future score it will receive once the entire sequence has been generated.
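The gradient problem is worth seeing concretely: the sampled token itself carries no gradient, yet the gradient of the *expected* reward can still be estimated with the score-function (REINFORCE) trick. A toy numerical check, assuming a one-step "sequence" over a 3-token vocabulary and fixed per-token rewards standing in for discriminator scores:

```python
import math
import random

random.seed(0)

# Toy setting: generator is a softmax over logits theta; each token k
# earns a fixed reward R[k] (a stand-in for a discriminator score).
theta = [0.2, -0.1, 0.5]
R = [1.0, 0.0, 2.0]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

# Analytic gradient of the expected reward J(theta) = sum_k p_k R[k]:
# dJ/dtheta_i = p_i * (R[i] - J)  (softmax score-function identity)
p = softmax(theta)
J = sum(pk * rk for pk, rk in zip(p, R))
grad_exact = [p[i] * (R[i] - J) for i in range(3)]

# REINFORCE estimate: sample k ~ p, accumulate R[k] * d(log p_k)/d(theta_i),
# where d(log p_k)/d(theta_i) = 1[i == k] - p_i. No gradient ever flows
# "through" the discrete sample itself.
n = 200_000
grad_est = [0.0, 0.0, 0.0]
for _ in range(n):
    k = random.choices(range(3), weights=p)[0]
    for i in range(3):
        grad_est[i] += R[k] * ((1.0 if i == k else 0.0) - p[i]) / n
```

The Monte Carlo estimate converges to the analytic gradient, which is why treating the generator as a stochastic policy sidesteps the differentiation problem.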
Key points
By modeling the data generator as a stochastic policy in reinforcement learning (RL), SeqGAN bypasses the generator differentiation problem by directly performing the policy gradient update (REINFORCE).
Applies Monte Carlo search with a roll-out policy to sample the unknown remaining tokens, so the discriminator can provide reward estimates for partially generated sequences.
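The roll-out step can be sketched as below; the roll-out policy and discriminator here are hypothetical stand-ins (a random completer and a parity-based scorer), not the paper's learned models:

```python
import random

random.seed(0)

VOCAB = [0, 1, 2, 3]  # toy token ids
SEQ_LEN = 6

def rollout_policy(prefix):
    """Stand-in roll-out policy (in SeqGAN, the generator itself):
    completes a partial sequence with random tokens."""
    seq = list(prefix)
    while len(seq) < SEQ_LEN:
        seq.append(random.choice(VOCAB))
    return seq

def discriminator(seq):
    """Dummy discriminator score in [0, 1]: fraction of even tokens,
    standing in for P(sequence is real)."""
    return sum(1 for t in seq if t % 2 == 0) / len(seq)

def mc_q_value(prefix, n_rollouts=500):
    """Estimate the reward for a partial sequence by Monte Carlo search:
    complete it n_rollouts times with the roll-out policy and average
    the discriminator's scores on the full sequences."""
    total = sum(discriminator(rollout_policy(prefix)) for _ in range(n_rollouts))
    return total / n_rollouts

# Under this toy discriminator, a prefix of even tokens earns a higher
# estimated reward than a prefix of odd tokens.
q_even = mc_q_value([0, 2, 2])
q_odd = mc_q_value([1, 3, 3])
```

This is how intermediate generation steps get a learning signal even though the discriminator only scores complete sequences.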
How the author proved effectiveness of the proposal?
Synthetic data: a randomly initialized LSTM serves as an oracle, and SeqGAN improves the oracle negative log-likelihood of generated samples over MLE, scheduled sampling, and PG-BLEU baselines.
Real data: Chinese poem generation, Obama political speech generation, and music generation, evaluated with BLEU scores and human judgment.
Summary
Link
SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient
Code
Any discussions?
What should I read next?
Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks
How (not) to train your generative model: Scheduled sampling, likelihood, adversary?