zplizzi / pytorch-ppo

Simple, readable, yet full-featured implementation of PPO in Pytorch

framestack #2

Open merv801 opened 4 years ago

merv801 commented 4 years ago

Hello. Thanks for sharing this repo. It seems that you have not used frame stacking (using 3 or 4 consecutive frames as the state) in the Atari environment. Is that correct? I was also wondering whether you have tested it on other Atari games?

zplizzi commented 4 years ago

That's right - it should be pretty easy to add though if you wanted to try it! I think I briefly tried some other games, but unfortunately didn't record the details.

merv801 commented 4 years ago

Thanks for the reply. I am planning to rewrite your implementation to learn about PPO, so I will try adding frame-stacking. However, I was wondering: is there any reason you didn't add frame-stacking? Does it help with learning in a specific way? Does it hinder the learning process?

zplizzi commented 4 years ago

Frame-stacking is generally understood to be important for making games more Markovian - meaning that the true state of the game is fully represented by the input to the model. E.g. in Pong, from a single frame you can't tell which direction the ball is moving - but from a stack of 4 frames you can. Honestly, I think I just forgot to add it here - and in Pong it's probably not too important, because a policy of "keep the paddle in front of the ball" works decently even if you don't know which way the ball is moving.
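
If you wanted to try adding it, here's a minimal sketch of the kind of frame-stacking wrapper I mean - this isn't code from this repo, and it assumes the classic `gym` API (`reset()` returns an observation, `step()` returns `(obs, reward, done, info)`) with HWC image observations:

```python
# Minimal frame-stacking sketch: keep the last k observations in a deque
# and concatenate them along the channel axis so the policy sees motion.
from collections import deque

import gym
import numpy as np


class FrameStack(gym.Wrapper):
    """Stack the last `k` observations along the channel axis."""

    def __init__(self, env, k=4):
        super().__init__(env)
        self.k = k
        self.frames = deque(maxlen=k)
        h, w, c = env.observation_space.shape
        self.observation_space = gym.spaces.Box(
            low=0, high=255, shape=(h, w, c * k), dtype=np.uint8
        )

    def reset(self):
        obs = self.env.reset()
        # Fill the buffer with copies of the first frame so the stacked
        # observation always contains exactly k frames.
        for _ in range(self.k):
            self.frames.append(obs)
        return self._stacked()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.frames.append(obs)
        return self._stacked(), reward, done, info

    def _stacked(self):
        return np.concatenate(list(self.frames), axis=-1)
```

You'd wrap the environment before handing it to the rollout code, e.g. `env = FrameStack(gym.make("PongNoFrameskip-v4"), k=4)` (the env id here is just an example), and widen the first conv layer's input channels to match.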