tambetm / simple_dqn

Simple deep Q-learning agent.
MIT License

slow training speed in latest code? #49

Open mw66 opened 7 years ago

mw66 commented 7 years ago

I just tried the latest code and found that training speed has slowed down significantly: it used to be more than 200 steps_per_second, but now it's ~100 steps_per_second.

2017-09-24 15:08:08,844 Epoch #169
2017-09-24 15:08:08,844 Training for 250000 steps
2017-09-24 15:43:51,299 num_games: 1101, average_reward: 24.793824, min_game_reward: 0, max_game_reward: 400
2017-09-24 15:43:51,299 last_exploration_rate: 0.100000, epoch_time: 2143s, steps_per_second: 116
2017-09-24 15:43:51,300 Saving weights to snapshots/breakout_169.prm
2017-09-24 15:43:51,325 Testing for 125000 steps
2017-09-24 15:54:51,175 num_games: 70, average_reward: 240.414286, min_game_reward: 31, max_game_reward: 426
2017-09-24 15:54:51,175 last_exploration_rate: 0.050000, epoch_time: 660s, steps_per_second: 189
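(As a quick sanity check of the numbers above, not part of the repo: the reported steps_per_second is simply steps divided by epoch_time.)

```python
# Recompute steps_per_second from the values printed in the log above.
train_steps, train_time = 250_000, 2143  # "Training for 250000 steps", "epoch_time: 2143s"
test_steps, test_time = 125_000, 660     # "Testing for 125000 steps", "epoch_time: 660s"

print(train_steps // train_time)  # -> 116, matching the reported training steps_per_second
print(test_steps // test_time)    # -> 189, matching the reported testing steps_per_second
```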

I wonder what recent change may have caused the slowdown.

tambetm commented 7 years ago

Unfortunately I'm not actively working on this codebase anymore, but I'd be happy to accept a PR. Leaving this open until then.

iNomaD commented 7 years ago

Hello! What is the usual training speed on CPU? I get only 7 steps_per_second (14 steps_per_second with MKL), which is very slow. Is it possible to improve CPU performance for this algorithm?

tambetm commented 7 years ago

Unfortunately no, you definitely need a GPU for training.

iNomaD commented 7 years ago

@tambetm, is it the issue of the Q-learning algorithm itself or of the current implementation? I mean, do you have an idea whether I could reach higher performance with Q-learning on CPU using other tools and frameworks?

tambetm commented 7 years ago

All reinforcement learning algorithms are compute intensive. Asynchronous Advantage Actor-Critic (A3C) is the one that can be more easily parallelized to use multiple CPUs. Search "a3c github" for some example implementations.
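(To illustrate the parallelization idea mentioned above: in A3C, each worker process runs its own copy of the environment and contributes experience/gradient updates independently, so throughput scales with CPU cores. The sketch below is only a toy skeleton of that actor structure, with hypothetical names like `worker` and no actual learning; it is not code from any particular A3C implementation.)

```python
import multiprocessing as mp

def worker(worker_id, n_steps, result_queue):
    # Each actor would own a private environment instance here.
    # A real A3C worker also computes gradients from its rollouts and
    # applies them asynchronously to shared network parameters.
    steps = 0
    for _ in range(n_steps):
        steps += 1  # placeholder for env.step() + learning
    result_queue.put((worker_id, steps))

if __name__ == "__main__":
    n_workers, steps_each = 4, 1000
    queue = mp.Queue()
    procs = [mp.Process(target=worker, args=(i, steps_each, queue))
             for i in range(n_workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    total = sum(queue.get()[1] for _ in procs)
    print(total)  # steps collected across CPU cores in parallel
```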

mcbrs1a commented 6 years ago

How do you find the steps per second? I am also running on a CPU and just get this output, which hangs:

./train.sh Breakout-v0 --environment gym
No handlers could be found for logger "gym.envs.registration"

But top shows it's processing away.

mw66 commented 6 years ago

My numbers above are from a GPU.

mcbrs1a commented 6 years ago

So it's not possible to run at all on cpu?

tambetm commented 6 years ago

Your error refers to logging and shouldn't stop the program from proceeding. Training is just slow on CPU, so it might take a while to print out the training statistics. You can, however, run the pre-trained Pong and Breakout models against the game without a GPU.