Open howardyclo opened 6 years ago
This paper presents SPIRAL, an adversarially trained RL agent that generates a program, which a graphics engine then executes, in order to interpret and sample images. The goal is to mitigate the need for large amounts of supervision and the difficulty of scaling inference algorithms to richer datasets.
The agent is rewarded by fooling a discriminator network, and is trained with distributed reinforcement learning without any extra supervision. The discriminator network itself is trained to distinguish between rendered and real images.
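The reward path described above can be sketched in a few lines. This is only a toy illustration, not the paper's implementation: the `Stroke` action type, the pixel-painting `render` function, and `toy_discriminator` are all hypothetical stand-ins for the real graphics engine and the learned discriminator network.

```python
from typing import Callable, List, Tuple

Stroke = Tuple[int, int]      # hypothetical action: paint the pixel at (row, col)
Image = List[List[float]]


def render(program: List[Stroke], size: int = 4) -> Image:
    """Toy stand-in for the graphics engine: paint each listed pixel."""
    canvas = [[0.0] * size for _ in range(size)]
    for row, col in program:
        canvas[row][col] = 1.0
    return canvas


def toy_discriminator(image: Image) -> float:
    """Placeholder for the learned discriminator (higher = looks more real).

    Assumption for this sketch only: 'real' images have a filled main
    diagonal and nothing else.
    """
    size = len(image)
    on_diagonal = sum(image[i][i] for i in range(size))
    off_diagonal = sum(map(sum, image)) - on_diagonal
    return on_diagonal - off_diagonal


def episode_reward(program: List[Stroke],
                   discriminator: Callable[[Image], float]) -> float:
    """Execute the program, then score the rendered canvas.

    No ground-truth program is consulted; the discriminator's score
    is the only training signal the agent receives.
    """
    return discriminator(render(program))


good_program = [(0, 0), (1, 1), (2, 2), (3, 3)]
bad_program = [(0, 1)]
print(episode_reward(good_program, toy_discriminator))
print(episode_reward(bad_program, toy_discriminator))
```

In the actual system the discriminator would be a network trained alongside the agent, so the reward function itself changes over the course of training.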
Using the discriminator's output as the RL reward signal works significantly better than directly optimizing the pixel error between the rendered image and the real image.
In practice, for conditional generation of 2D images, they use the discriminator score rather than the L2 distance as the generator's reward. The reason is illustrated below:
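One common argument against a pixel-wise L2 reward, and plausibly the point being made here, is that L2 stops changing once the rendered shape no longer overlaps the target, so it gives the agent no hint about which direction to move. The sketch below demonstrates this on a made-up 1-D "image" containing a single bright bar. The helper names `bar_image` and `l2_distance` are hypothetical and the setup is not taken from the paper.

```python
from typing import List


def bar_image(offset: int, width: int = 16, side: int = 3) -> List[float]:
    """1-D toy image: a bright bar of length `side` starting at `offset`."""
    return [1.0 if offset <= i < offset + side else 0.0 for i in range(width)]


def l2_distance(a: List[float], b: List[float]) -> float:
    """Plain Euclidean distance between two equally sized images."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


target = bar_image(0)

# Shift the rendered bar further and further away from the target.
distances = [l2_distance(bar_image(shift), target) for shift in range(10)]

for shift, distance in enumerate(distances):
    print(f"shift={shift}  L2={distance:.3f}")
```

The distance grows while the two bars still overlap (shifts 0 to 2) and is constant from shift 3 onward. A reward built on this distance is therefore flat over most of the search space, whereas a learned discriminator can in principle still tell "far" from "near".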
Metadata
Authors: Yaroslav Ganin, Tejas Kulkarni, Igor Babuschkin, S. M. Ali Eslami, Oriol Vinyals
Organization: DeepMind
Release Date: arXiv 2018
Paper: https://arxiv.org/pdf/1804.01118.pdf
Video: https://youtu.be/iSyvwAwa7vk