Training on environments with long episode length

NVlabs / GA3C

Hybrid CPU/GPU implementation of the A3C algorithm for deep reinforcement learning.

Training on environments with long episode length #29

Closed JimMcMahon closed 7 years ago

JimMcMahon commented 7 years ago

Hello!

I'm currently trying to train on a problem that requires anywhere from 500 to 10,000 steps per episode. Training is excruciatingly slow with the default config values. I've been experimenting with some of the parameters but haven't made any headway. Any recommendations on ways to improve the training speed?

Edit: the main thing I tried was setting tmax to a very large number so that each episode is batched into a single update. This helped, but not as much as I had hoped.
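For readers unfamiliar with what tmax controls: in A3C-style training, each agent collects at most tmax steps and then computes discounted returns over that segment, bootstrapping from the value estimate of the last state unless the episode ended. The following is a minimal sketch of that return computation, not GA3C's actual code; the function name, argument names, and the discount of 0.99 are assumptions made for illustration.

```python
def n_step_returns(rewards, bootstrap_value, discount=0.99):
    """Discounted returns for one rollout segment of at most tmax steps.

    rewards: rewards collected in the segment, oldest first.
    bootstrap_value: value estimate of the state after the last step
        (use 0.0 if the episode terminated inside the segment).
    discount: hypothetical default, not taken from the GA3C config.
    """
    returns = []
    running = bootstrap_value
    # Walk backwards so each return reuses the one that follows it.
    for reward in reversed(rewards):
        running = reward + discount * running
        returns.append(running)
    returns.reverse()
    return returns
```

With a very large tmax, the segment covers the whole episode and the bootstrap value is never needed, which is the "one update per episode" setup described above.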

ifrosio commented 7 years ago

Hi, the number of frames per episode in some Atari games is of the same order of magnitude (e.g., 1,000 to 2,000 frames per episode for Pong), so this does not seem to be a crucial factor for speed. Increasing tmax can increase the number of frames processed per second, but convergence may be slower, since you perform fewer updates per second. Have you tried simply increasing the learning rate? What speed (FPS) do you achieve for Pong on the same machine? BTW, without more information about your task it is hard to make detailed comments.
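The trade-off described above can be made concrete with some simple arithmetic. This is a rough sketch under the assumption that one agent produces one update per segment of at most tmax steps; the function name and the tmax values tried are hypothetical and not taken from the GA3C config.

```python
import math


def updates_per_episode(episode_length, t_max):
    """Number of rollout segments (and hence updates) one episode yields."""
    return math.ceil(episode_length / t_max)


# Compare a small, a medium, and an episode-sized t_max for a
# 10,000-step episode (the upper end mentioned in the question).
for t_max in (5, 100, 10_000):
    print(t_max, updates_per_episode(10_000, t_max))
```

For a 10,000-step episode, the update count falls from 2,000 at tmax = 5 to 1 at tmax = 10,000, so each frame contributes to far fewer gradient steps even if raw FPS goes up.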

JimMcMahon commented 7 years ago

Thanks for the reply! I looked into the problem more carefully and discovered that the issue isn't with GA3C but with the simulation environment I'm using for my task. Apologies for the false alarm.

ifrosio commented 7 years ago

Great. Let's close this then.