Are the action (x4) semantics different?

muupan / async-rl

Replicating "Asynchronous Methods for Deep Reinforcement Learning" (http://arxiv.org/abs/1602.01783)

MIT License


Open mw66 opened 7 years ago

mw66 commented 7 years ago

Hi,

I just noticed:

Is each training action applied to the game environment 4 times?

E.g. the user presses 'down' once, but in your simulated training the environment takes the 'down' action 4 times!

I wonder why? And will the results differ from the original paper?

muupan commented 7 years ago

The original paper uses action repeat, so this matches its setup. See Section 8, "Experimental Setup", in http://arxiv.org/abs/1602.01783.
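
For readers unfamiliar with the term, here is a minimal sketch of what action repeat means. It is not this repository's code: `CountingEnv` and `ActionRepeat` are hypothetical names, the toy environment's `step` signature is an assumption made for illustration, and the repeat count of 4 is the value discussed in this thread. The wrapper applies the chosen action on several consecutive environment steps, sums the rewards, and stops early if the episode ends.

```python
class CountingEnv:
    """Hypothetical toy environment: every step gives reward 1.0,
    and the episode ends after `horizon` steps."""

    def __init__(self, horizon=10):
        self.horizon = horizon
        self.t = 0

    def step(self, action):
        # The action is ignored here; a real game would use it.
        self.t += 1
        done = self.t >= self.horizon
        return self.t, 1.0, done  # (observation, reward, done)


class ActionRepeat:
    """Applies each agent action `repeat` times (repeat must be >= 1)."""

    def __init__(self, env, repeat=4):
        self.env = env
        self.repeat = repeat

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.repeat):
            obs, reward, done = self.env.step(action)
            total_reward += reward  # accumulate reward over repeated steps
            if done:
                break  # do not step past the end of the episode
        return obs, total_reward, done


env = ActionRepeat(CountingEnv(horizon=10), repeat=4)
results = [env.step("down") for _ in range(3)]
print(results)
```

From the agent's point of view one decision covers up to four environment steps, so it chooses actions less often than the game advances. The last call above returns a smaller reward because the episode ends partway through the repeat.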