I ran the async dqn model out of the box with 3 seeds on 7 atari games on 24 threads -- Pong, Breakout, SeaQuest, BeamRider, SpaceInvaders, Qbert, and Enduro. However, the reward stays the same for all the games until 11M global time steps. I've also run Breakout up to 30M global steps with 5 seeds and the reward doesn't go up either. Anybody has this issue?
I ran the async dqn model out of the box with 3 seeds on 7 atari games on 24 threads -- Pong, Breakout, SeaQuest, BeamRider, SpaceInvaders, Qbert, and Enduro. However, the reward stays the same for all the games until 11M global time steps. I've also run Breakout up to 30M global steps with 5 seeds and the reward doesn't go up either. Anybody has this issue?