coreylynch / async-rl

TensorFlow + Keras + OpenAI Gym implementation of 1-step Q-learning from "Asynchronous Methods for Deep Reinforcement Learning"
MIT License
1.01k stars, 174 forks

Reward doesn't go up .... #21

Open weichengkuo opened 7 years ago

weichengkuo commented 7 years ago

I ran the async DQN model out of the box with 3 seeds on 7 Atari games using 24 threads: Pong, Breakout, SeaQuest, BeamRider, SpaceInvaders, Qbert, and Enduro. However, the reward stays flat for every game up to 11M global time steps. I also ran Breakout for up to 30M global steps with 5 seeds, and the reward doesn't go up there either. Has anyone else run into this issue?
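For anyone trying to reproduce this, here is a minimal sketch of one way to quantify "the reward stays the same" from logged episode rewards. It is not part of the async-rl repository: the helper names `reward_trend` and `is_flat`, the window size, and the tolerance are all assumptions chosen for illustration, and the data at the bottom is made up.

```python
from statistics import mean


def reward_trend(episode_rewards, window=100):
    """Return (early_mean, late_mean): the mean reward over the first
    and the last `window` episodes of a run. Hypothetical helper."""
    if len(episode_rewards) < 2 * window:
        raise ValueError("need at least 2 * window episodes")
    early = mean(episode_rewards[:window])
    late = mean(episode_rewards[-window:])
    return early, late


def is_flat(episode_rewards, window=100, tolerance=0.05):
    """Treat a run as flat when the late mean is within `tolerance`
    (relative) of the early mean. The 5% threshold is an arbitrary
    assumption, not a value taken from the paper or the repository."""
    early, late = reward_trend(episode_rewards, window)
    scale = max(abs(early), abs(late), 1.0)  # avoid dividing by ~0
    return abs(late - early) <= tolerance * scale


# Made-up reward sequences, for illustration only.
flat_run = [1.0, 2.0, 1.0, 2.0] * 100              # noisy but not improving
improving_run = [float(i) / 10 for i in range(400)]  # steadily increasing

print(is_flat(flat_run))
print(is_flat(improving_run))
```

Running the same check on each seed separately would show whether every seed is flat or only some of them, which helps separate a real learning failure from seed-to-seed noise.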