Open mw66 opened 7 years ago
Hi,
I just noticed:
https://github.com/muupan/async-rl/blob/master/ale.py#L115
It looks like each training action is applied to the game environment 4 times?
E.g. the user presses 'down' once, but in your simulated training the environment takes the 'down' action 4 times!
I wonder why, and will the results differ from the original paper?
The original paper uses action repeat as well. See Section 8 (Experimental Setup) in http://arxiv.org/abs/1602.01783.
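To illustrate what action repeat means, here is a minimal sketch. It is not the code in ale.py: `CountingEnv`, `repeat_action`, and `n_repeat` are hypothetical names made up for this example, and the toy environment simply pays a reward of 1 per frame. The agent chooses an action once, the environment executes it for several consecutive frames, and the rewards are summed.

```python
class CountingEnv:
    """Hypothetical stand-in for an emulator: counts frames, pays 1.0 per frame."""

    def __init__(self, episode_length=10):
        self.episode_length = episode_length
        self.frames = 0

    def act(self, action):
        # Advance the toy game by one frame and return that frame's reward.
        self.frames += 1
        return 1.0

    @property
    def game_over(self):
        return self.frames >= self.episode_length


def repeat_action(env, action, n_repeat=4):
    """Apply one chosen action for up to n_repeat frames and sum the rewards."""
    total_reward = 0.0
    for _ in range(n_repeat):
        total_reward += env.act(action)
        if env.game_over:
            # Stop early so no frames are played past the end of the episode.
            break
    return total_reward


env = CountingEnv(episode_length=10)
# Three agent decisions cover 4 + 4 + 2 frames of a 10-frame episode.
rewards = [repeat_action(env, "down") for _ in range(3)]
print(rewards, env.frames)
```

With a repeat of 4, the agent makes one decision every 4 frames, so one 'down' choice becomes 4 'down' frames in the emulator.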