nagataka / Read-a-Paper

Survey

Model-Based Reinforcement Learning for Atari #45

Open nagataka opened 2 years ago

nagataka commented 2 years ago

Summary

Link

Author/Institution

What is this

Simulated Policy Learning (SimPLe)
- Video prediction techniques + policy training within the learned model
- Looks like a Dyna-style variant of World Models
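
To make the "policy training within the learned model" idea concrete, here is a minimal Dyna-style sketch on a toy chain environment. Everything in it is an assumption for illustration: the chain environment, the lookup-table "world model", and tabular Q-learning stand in for Atari, the video prediction network, and the policy optimizer used in the paper. The names (`real_step`, `fit_model`, `train_in_model`, `dyna_style_loop`) are hypothetical.

```python
import random
from collections import defaultdict

N_STATES, ACTIONS, GOAL = 6, (0, 1), 5  # toy chain; action 0 = left, 1 = right

def real_step(s, a):
    """Toy stand-in for the real environment (hypothetical, not Atari)."""
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    return s2, float(s2 == GOAL), s2 == GOAL

def greedy(q, s, rng=None):
    """Best action under q; ties are broken randomly when an rng is given."""
    best = max(q[(s, a)] for a in ACTIONS)
    ties = [a for a in ACTIONS if q[(s, a)] == best]
    return rng.choice(ties) if rng else ties[0]

def collect(q, n_steps, eps, rng):
    """Run the current policy in the real environment and record transitions."""
    data, s = [], 0
    for _ in range(n_steps):
        a = rng.choice(ACTIONS) if rng.random() < eps else greedy(q, s, rng)
        s2, r, done = real_step(s, a)
        data.append((s, a, r, s2, done))
        s = 0 if done else s2
    return data

def fit_model(model, data):
    """'World model': a lookup table of observed transitions."""
    for s, a, r, s2, done in data:
        model[(s, a)] = (r, s2, done)

def train_in_model(q, model, n_updates, rng, alpha=0.5, gamma=0.9):
    """Improve the policy using only simulated transitions from the model."""
    keys = list(model)
    for _ in range(n_updates):
        s, a = rng.choice(keys)
        r, s2, done = model[(s, a)]
        target = r if done else r + gamma * max(q[(s2, b)] for b in ACTIONS)
        q[(s, a)] += alpha * (target - q[(s, a)])

def dyna_style_loop(iterations=5, real_steps=200, sim_updates=2000, seed=0):
    """Alternate: collect real data -> update model -> train policy in model."""
    rng = random.Random(seed)
    q, model = defaultdict(float), {}
    for _ in range(iterations):
        fit_model(model, collect(q, real_steps, eps=0.5, rng=rng))
        train_in_model(q, model, sim_updates, rng)
    return q

q = dyna_style_loop()
policy = [greedy(q, s) for s in range(GOAL)]  # greedy action in each non-goal state
```

The point of the sketch is the loop structure: the real environment is only touched in `collect`, while all policy updates happen on transitions replayed from the learned model.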

Comparison with previous research. What are the novelties/good points?

Key points

- A skip-connected convolutional encoder and decoder, which outputs the predicted next frame and the expected reward
- A convolutional inference network, which approximates the posterior given the next frame
- An LSTM-based network, which is trained to approximate each bit of the discrete latent given the previous ones
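
The "each bit given the previous ones" part is an autoregressive factorization, p(b_1..b_n) = Π p(b_i | b_1..b_{i-1}). A minimal sketch of sampling under that factorization is below. The `toy_bit_prob` function is a hypothetical stand-in for the LSTM and is not taken from the paper; only the sampling loop is the point.

```python
import random

def sample_bits(bit_prob, n_bits, rng):
    """Sample b_1..b_n one at a time, each conditioned on the bits before it."""
    bits = []
    for _ in range(n_bits):
        p_one = bit_prob(bits)  # P(b_i = 1 | b_1..b_{i-1})
        bits.append(1 if rng.random() < p_one else 0)
    return bits

def toy_bit_prob(prefix):
    """Hypothetical stand-in for the LSTM: tends to repeat the previous bit."""
    if not prefix:
        return 0.5
    return 0.9 if prefix[-1] == 1 else 0.1

latent = sample_bits(toy_bit_prob, n_bits=8, rng=random.Random(0))
```

In this sketch a trained sequence model would replace `toy_bit_prob`; the loop itself would stay the same.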

Screenshot 2022-01-28 15 34 37

How did the authors prove the effectiveness of the proposal?

Experiments on Atari games
- with a budget restricted to 100K time steps, roughly two hours of play time
- outperforms state-of-the-art model-free algorithms (Rainbow) in most games under this budget
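
A quick sanity check of the "100K time steps is roughly two hours" figure. The frame skip of 4 and the 60 frames per second are assumptions about the usual Atari setup and are not stated in these notes.

```python
AGENT_STEPS = 100_000  # interaction budget from the notes above
FRAME_SKIP = 4         # assumption: each agent step repeats its action for 4 frames
FPS = 60               # assumption: emulator frames per second in real-time play

frames = AGENT_STEPS * FRAME_SKIP  # total emulator frames
hours = frames / FPS / 3600        # real-time play in hours
print(f"{frames} frames, about {hours:.2f} hours of play")
```

Under these assumptions the budget comes to a little under two hours, which matches the rough figure above.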

Any discussions?

What should I read next?