TMats / survey

Summary of Paper Survey
https://tmats.github.io/survey/
16 stars 2 forks source link

Deep Recurrent Q-Learning for Partially Observable MDPs #146

Open TMats opened 6 years ago

TMats commented 6 years ago
TMats commented 6 years ago

DRQN(Deep Recurrent Q-Network)

DQNはatariで4フレーム使うことでMDPになる

For many games, a single game screen is insufficient to determine the state of the system. DQN infers the full state of an Atari game by expanding the state representation to encompass the last four game screens. Many games that were previously POMDPs now become MDPs.