SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient
Author/Institution
Lantao Yu, Weinan Zhang, Jun Wang, Yong Yu
Shanghai Jiao Tong University, University College London
What is this
Proposes SeqGAN, a framework for generating sequences of discrete tokens by training a GAN with reinforcement learning
Comparison with previous researches. What are the novelties/good points?
A standard GAN has two limitations when the goal is generating sequences of discrete tokens:
The discrete outputs from the generative model make it difficult to pass the gradient update from the discriminative model back to the generative model (sampling a token is not differentiable).
The discriminative model can only assess a complete sequence; for a partially generated sequence, it is non-trivial to balance its current score against the future score it will receive once the entire sequence has been generated.
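The gradient problem is worth seeing concretely: the sampled token itself carries no gradient, yet the gradient of the *expected* reward can still be estimated with the score-function (REINFORCE) trick. A toy numerical check, assuming a one-step "sequence" over a 3-token vocabulary and fixed per-token rewards standing in for discriminator scores:

```python
import math
import random

random.seed(0)

# Toy setting: generator is a softmax over logits theta; each token k
# earns a fixed reward R[k] (a stand-in for a discriminator score).
theta = [0.2, -0.1, 0.5]
R = [1.0, 0.0, 2.0]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

# Analytic gradient of the expected reward J(theta) = sum_k p_k R[k]:
# dJ/dtheta_i = p_i * (R[i] - J)  (softmax score-function identity)
p = softmax(theta)
J = sum(pk * rk for pk, rk in zip(p, R))
grad_exact = [p[i] * (R[i] - J) for i in range(3)]

# REINFORCE estimate: sample k ~ p, accumulate R[k] * d(log p_k)/d(theta_i),
# where d(log p_k)/d(theta_i) = 1[i == k] - p_i. No gradient ever flows
# "through" the discrete sample itself.
n = 200_000
grad_est = [0.0, 0.0, 0.0]
for _ in range(n):
    k = random.choices(range(3), weights=p)[0]
    for i in range(3):
        grad_est[i] += R[k] * ((1.0 if i == k else 0.0) - p[i]) / n
```

The Monte Carlo estimate converges to the analytic gradient, which is why treating the generator as a stochastic policy sidesteps the differentiation problem.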
Key points
By modeling the data generator as a stochastic policy in reinforcement learning (RL), SeqGAN bypasses the generator differentiation problem by directly performing the policy gradient update (REINFORCE).
Applies Monte Carlo search with a roll-out policy to sample the unknown remaining tokens, so the discriminator can provide reward estimates for partially generated sequences.
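The roll-out step can be sketched as below; the roll-out policy and discriminator here are hypothetical stand-ins (a random completer and a parity-based scorer), not the paper's learned models:

```python
import random

random.seed(0)

VOCAB = [0, 1, 2, 3]  # toy token ids
SEQ_LEN = 6

def rollout_policy(prefix):
    """Stand-in roll-out policy (in SeqGAN, the generator itself):
    completes a partial sequence with random tokens."""
    seq = list(prefix)
    while len(seq) < SEQ_LEN:
        seq.append(random.choice(VOCAB))
    return seq

def discriminator(seq):
    """Dummy discriminator score in [0, 1]: fraction of even tokens,
    standing in for P(sequence is real)."""
    return sum(1 for t in seq if t % 2 == 0) / len(seq)

def mc_q_value(prefix, n_rollouts=500):
    """Estimate the reward for a partial sequence by Monte Carlo search:
    complete it n_rollouts times with the roll-out policy and average
    the discriminator's scores on the full sequences."""
    total = sum(discriminator(rollout_policy(prefix)) for _ in range(n_rollouts))
    return total / n_rollouts

# Under this toy discriminator, a prefix of even tokens earns a higher
# estimated reward than a prefix of odd tokens.
q_even = mc_q_value([0, 2, 2])
q_odd = mc_q_value([1, 3, 3])
```

This is how intermediate generation steps get a learning signal even though the discriminator only scores complete sequences.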
How the author proved effectiveness of the proposal?
Synthetic data: a randomly initialized LSTM serves as an oracle, and SeqGAN improves the oracle negative log-likelihood of generated samples over MLE, scheduled sampling, and PG-BLEU baselines.
Real data: Chinese poem generation, Obama political speech generation, and music generation, evaluated with BLEU scores and human judgment.
Summary
Link
SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient
Code
Any discussions?
What should I read next?
Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks
How (not) to train your generative model: Scheduled sampling, likelihood, adversary?