muupan / async-rl

Replicating "Asynchronous Methods for Deep Reinforcement Learning" (http://arxiv.org/abs/1602.01783)
MIT License

Are the action (x4) semantics different? #25

Open mw66 opened 7 years ago

mw66 commented 7 years ago

Hi,

I just noticed:

https://github.com/muupan/async-rl/blob/master/ale.py#L115

Is each training action applied to the game environment 4 times?

E.g. the user presses 'down' once, but in your simulated training the environment takes the 'down' action 4 times!

I wonder why, and whether the results will differ from the original paper.

muupan commented 7 years ago

The original paper uses action repeating. See Section 8 (Experimental Setup) in http://arxiv.org/abs/1602.01783.
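
For illustration, here is a minimal sketch of what action repeating means. It is not the code from ale.py: `CountingEnv`, `act`, and `step_with_action_repeat` are hypothetical names, and the reward of 1.0 per frame is an assumption made only so the summed reward is easy to check. The agent picks one action, and the wrapper sends it to the emulator for several consecutive frames, returning the accumulated reward.

```python
class CountingEnv:
    """Hypothetical stand-in for an emulator: each act() call advances one frame."""

    def __init__(self):
        self.frames = 0
        self.actions = []

    def act(self, action):
        self.frames += 1
        self.actions.append(action)
        return 1.0  # assumed reward per frame, for illustration only


def step_with_action_repeat(env, action, repeat=4):
    """Apply the same action for `repeat` consecutive frames and sum the rewards."""
    total_reward = 0.0
    for _ in range(repeat):
        total_reward += env.act(action)
    return total_reward


env = CountingEnv()
reward = step_with_action_repeat(env, "down")  # one agent decision, four emulator frames
```

From the agent's point of view this is a single step: it chooses again only after all repeated frames have been played.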