Hmm, I still don't know the reason for that, but one thing I noticed from your screenshot is that you are using all the K40s for a single experiment. The pre-defined models in this repo are pretty small (200-400 MB), so you can fit multiple (more than 10) experiments on a single K40. I pushed a6a836edd8cc9cba41e84eaa54dce20c3b24a5b2 to add allow_soft_placement, whose default value is True.
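For reference, here is a minimal sketch of what that kind of session configuration looks like in TF 1.x; the memory-fraction value below is just illustrative, not the repo's actual setting:

```python
import tensorflow as tf

# allow_soft_placement lets TF fall back to another device when an op
# can't run on the one it was pinned to (e.g. a CPU-only op placed on a GPU).
config = tf.ConfigProto(allow_soft_placement=True)

# Optionally cap per-process GPU memory so several experiments can share
# one K40 instead of a single process grabbing the whole card.
config.gpu_options.per_process_gpu_memory_fraction = 0.1  # illustrative value

sess = tf.Session(config=config)
```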
Did you try training without tqdm? I know tqdm sometimes causes problems in multi-threaded settings, but this code does not use such fancy techniques, so I assume it's not a problem with tqdm.
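If you want to rule tqdm out, a minimal sketch of the same kind of loop without it (the max_step value and train_step function are placeholders, not the repo's code):

```python
import time

def train_step():
    pass  # stand-in for one training iteration

max_step = 100000  # placeholder

start = time.time()
for step in range(1, max_step + 1):  # plain loop instead of tqdm(range(max_step))
    train_step()
    if step % 10000 == 0:
        print('step %d: %.1f it/s overall' % (step, step / (time.time() - start)))
```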
In a typical DQN, training starts with random exploration, but later it gradually starts to use the network to pick actions. So some slowdown is normal (although 10x is definitely too much).
@ppwwyyxx I thought about that, but the step count in both images is quite large. I also found that the log avg_r: 0.001234 ... is only printed once random exploration has finished, which means both runs are in the training phase.
Edit: never mind, avg_r looks like it's always printed.
You're decreasing epsilon during the training phase, I guess? That would also gradually slow things down.
Oh, yes. That's right. That explains the slowdown. So in DQN, as the step count increases, the agent predicts the action with the network more and more often instead of choosing a random action.
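For concreteness, here is a rough sketch of the usual linearly annealed epsilon-greedy rule; the constants and the predict_action callback are placeholders, not the repo's actual values:

```python
import random

EP_START, EP_END = 1.0, 0.1  # placeholder values
EP_END_T = 1000000           # step at which epsilon stops decaying (placeholder)

def choose_action(step, state, predict_action, num_actions):
    # Epsilon decays linearly from EP_START to EP_END over EP_END_T steps,
    # so the cheap random branch is taken less and less often while the
    # Q-network forward pass (GPU work) is taken more and more often.
    epsilon = EP_END + max(0.0, (EP_START - EP_END) * (EP_END_T - step) / EP_END_T)
    if random.random() < epsilon:
        return random.randrange(num_actions)  # random exploration: no GPU work
    return predict_action(state)              # network forward pass on the GPU
```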
Thank you. @carpedm20 @ppwwyyxx Regarding "you can fit multiple (more than 10) experiments in a single K40": how do I do this, and how can I use my GPU more efficiently? Can I increase the batch_size or memory_size in main.py?
```python
flags.DEFINE_integer('batch_size', 32, 'The size of batch for minibatch training')
flags.DEFINE_integer('memory_size', 100, 'The size of experience memory (*= scale)')
```
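As a side note on memory_size: in DQN implementations like this one the replay memory is typically stored as numpy arrays in host RAM rather than on the GPU, so growing it mostly costs CPU memory. A back-of-envelope sketch, where the scale and frame-size values are assumptions rather than the repo's actual defaults:

```python
import numpy as np

memory_size = 100
scale = 10000            # assumed value of the 'scale' flag
frame_shape = (84, 84)   # assumed preprocessed frame size
n_transitions = memory_size * scale  # 'The size of experience memory (*= scale)'

bytes_per_frame = int(np.prod(frame_shape)) * np.dtype(np.uint8).itemsize
approx_gb = n_transitions * bytes_per_frame / 1e9
print('replay memory of frames alone: ~%.1f GB of host RAM' % approx_gb)
```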
What you need to do is just pull the code and run it. See the changes in a6a836edd8cc9cba41e84eaa54dce20c3b24a5b2 for details.
ok, thank you. @carpedm20
Isn't this still considered a bug? Even if it's always using the network to predict, the extra work is just one forward pass per step. Training happens every 4 steps, so that's 32 forward/backward passes every 4 steps. The prediction shouldn't make training 10x slower.
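A rough back-of-envelope check of that point, counting only network passes and treating a backward pass as roughly the cost of a forward pass (an approximation):

```python
batch_size = 32
train_every = 4  # training happens every 4 steps

# training cost per environment step, measured in single-sample network passes
train_passes_per_step = batch_size * 2.0 / train_every  # forward + backward = 16

# acting cost once epsilon is low: one forward pass per step
act_passes_per_step = 1.0

overhead = act_passes_per_step / train_passes_per_step
print('action prediction adds roughly %.0f%% extra network work' % (overhead * 100))
# ~6%, nowhere near a 10x slowdown (ignoring per-call launch overhead)
```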
Hello, I'm very confused about the iteration speed. Within a few minutes after the program starts, the it/s value is pretty large and it runs very fast, which is what I'd love to see. But as it keeps running, the it/s value keeps decreasing. After about 30 minutes, it drops from about 10000 to around 900, and it keeps going down.
Is this a problem with the GPU setup or with tqdm?
The graphics cards I use are two Nvidia K40s.
The picture below shows the nvidia-smi output 30 minutes after starting.