Closed cardwing closed 7 years ago
When I trained on the inverted pendulum task, it took me 12 hours to generate the 400-episode learning curve (with batch normalization) and 12 hours to generate the 1000-episode curve (without batch normalization) shown in link .
Regarding the Slow GPU problem:
Since the GPU usage is just 3-10%, I assume TensorFlow is not using the GPU properly. Did you check whether TensorFlow is actually running on the GPU? Recent GPU builds of TensorFlow silently fall back to the CPU if CUDA and cuDNN are not properly set up. I had a similar problem recently.
You can use this code to check GPU integration:
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
The slowness is also partly due to the batch gradient update performed at each time step. I also noticed the update step was fast only during the first few hours of training. Contributions are welcome if you have a faster implementation.
As explained in issue #5, by wrapper I meant a wrapper for normalizing the values of gym environments, i.e. normalizing states, actions, and rewards to the range 0-1.
Hi, steven! Recently, I downloaded your code and tested it on the "Reacher" task. However, I found that even with GPU-based TensorFlow, it runs only about 200 episodes per day, which seems a bit slow. Is there anything I need to adjust to speed up the process? (I found that GPU usage is low, around 3%~10%, so maybe the GPU is not being used sufficiently.) Also, you said that we could use one more wrapper to scale the reward; can you explain that more specifically? Thanks a lot!