TheMTank / cups-rl

Customisable Unified Physical Simulations (CUPS) for Reinforcement Learning. Experiments run on the ai2thor environment (http://ai2thor.allenai.org/) e.g. using A3C, RainbowDQN and A3C_GA (Gated Attention multi-modal fusion) for Task-Oriented Language Grounding (tasks specified by natural language instructions) e.g. "Pick up the Cup or else"
http://www.themtank.org
MIT License

Is there any way to improve training speed? #15

Closed jiafei1224 closed 5 years ago

jiafei1224 commented 5 years ago

I am currently training the Rainbow model on a GPU. I tried to train 2 models simultaneously; however, training became much slower. Are there any suggestions for improving my training speed? I am training on a GeForce RTX 2080 Ti.

beduffy commented 5 years ago

https://github.com/allenai/ai2thor/issues/123

jiafei1224 commented 5 years ago

Thanks for the reply. Can I also check with you: have you tried running Rainbow with the data-efficient parameters? --target-update 2000 --T-max 100000 --learn-start 1600 --memory-capacity 100000 --replay-frequency 1 --multi-step 20 --architecture data-efficient --hidden-size 256 --learning-rate 0.0001 --evaluation-interval 10000
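
For reference, here are those data-efficient flags gathered in one place. This is a minimal sketch only: the `main.py` entry point and the way the command is assembled are assumptions for illustration, not taken from this repository.

```python
# Data-efficient Rainbow settings quoted above, collected in one dict.
# The "main.py" entry point and exact flag spellings are assumptions;
# check them against the training script's argparse setup.
data_efficient_args = {
    "target-update": 2000,         # steps between target-network syncs
    "T-max": 100000,               # total environment steps
    "learn-start": 1600,           # steps collected before learning begins
    "memory-capacity": 100000,     # replay buffer size
    "replay-frequency": 1,         # learn after every environment step
    "multi-step": 20,              # n-step return length
    "architecture": "data-efficient",
    "hidden-size": 256,
    "learning-rate": 0.0001,
    "evaluation-interval": 10000,
}

# Build the hypothetical launch command from the dict.
flags = " ".join(f"--{k} {v}" for k, v in data_efficient_args.items())
print(f"python main.py {flags}")
```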

etendue commented 5 years ago

In the training loop, if you run multiple AI2Thor environments, most of the time is spent by the agent interacting with the environments; the neural network training itself takes comparatively little time. Since the AI2Thor environment uses the CPU as well as the GPU (for OpenGL rendering), consider upgrading CPU, memory and GPU together. In my previous trials, the bottleneck was the CPU and memory size. A single K80 GPU is good enough to run ~4 parallel environments. With a GCP configuration of 2x K80 GPUs + 16 CPU cores + 32 GB memory I could achieve 150 FPS (steps/s).
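
A quick way to confirm where the time goes is to time the two phases of the loop separately. This is a sketch only; `env` and `agent` are stand-ins for the AI2Thor wrapper and the Rainbow agent, not objects from this repository.

```python
import time

def profile_split(env, agent, n_steps=500):
    """Rough split of wall-clock time between environment stepping and learning.

    `env` and `agent` are placeholders for the AI2Thor wrapper and the Rainbow
    agent used in the training loop; only the timing logic matters here.
    """
    env_time, learn_time = 0.0, 0.0
    state = env.reset()
    for _ in range(n_steps):
        action = agent.act(state)

        t0 = time.perf_counter()
        state, reward, done, _ = env.step(action)   # CPU + OpenGL rendering
        env_time += time.perf_counter() - t0

        t0 = time.perf_counter()
        agent.learn()                               # GPU forward/backward pass
        learn_time += time.perf_counter() - t0

        if done:
            state = env.reset()

    total = env_time + learn_time
    print(f"env: {env_time:.1f}s ({100 * env_time / total:.0f}%), "
          f"learn: {learn_time:.1f}s, "
          f"throughput: {n_steps / total:.1f} steps/s")
```

If the environment share dominates, more CPU cores and memory (or more parallel environments) will help more than a faster GPU.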

fernandotorch commented 5 years ago

I would suggest you read the Rainbow DQN paper to understand better what those parameters mean, @jiafei1224. That being said, you might want to try a different multi-step value, because even the DeepMind authors only tried it within a range of 3-5, with 3 steps being the value that gave them the best results. Basically, that is the number of steps you look ahead when computing your action-value targets (don't take my word for it, follow the code yourself), so 20 is a very high number (computationally speaking) and I'm not sure it will even improve your training.
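
Concretely, the multi-step value n is the number of real rewards summed before bootstrapping from the target network's estimate. A minimal sketch of that target (standalone, not code from this repository):

```python
def n_step_return(rewards, bootstrap_q, gamma=0.99, n=3):
    """n-step target: sum n discounted real rewards, then bootstrap.

    rewards      : rewards observed after taking the action at step t
    bootstrap_q  : target-network estimate of max_a Q(s_{t+n}, a)
    With n = 3 (the Rainbow paper's choice) the target bootstraps quickly;
    with n = 20 it leans much more on sampled rewards (higher variance).
    """
    g = sum((gamma ** k) * r for k, r in enumerate(rewards[:n]))
    return g + (gamma ** n) * bootstrap_q

# Same trajectory, different look-ahead lengths:
rewards = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
print(n_step_return(rewards, bootstrap_q=0.5, n=3))
print(n_step_return(rewards, bootstrap_q=0.5, n=5))
```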

Since parameter tuning is somewhat problem-dependent, I would suggest you try to partially reproduce the paper by training on one of the Atari games and see for yourself how the parameters influence the learning process. That shouldn't take long, even on a "high-end" laptop, and once you understand it you can move on to your own problem and learn how a different problem setting also changes the way you should tune them. In summary, the paper is complex but it is the foundation of what you are doing, so you should try to understand it before you jump into the code. After that, try a simple example that you know will work, and then you will be ready to make your own decisions about how to tune it. I hope this answer was useful, but if you have more questions don't hesitate to ask again.
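
One way to run that comparison is a small sweep over the multi-step value on a single game. This is only a sketch: `main.py` and the flag names simply mirror those quoted earlier in the thread and are assumptions that should be checked against the actual training script.

```python
import subprocess

# Hypothetical sweep over the multi-step parameter.
# "main.py" and the flag names mirror the ones quoted earlier in this thread
# and are NOT confirmed against this repository; adapt them before running.
for n in (3, 5, 20):
    cmd = [
        "python", "main.py",
        "--multi-step", str(n),
        "--T-max", "100000",
        "--evaluation-interval", "10000",
    ]
    print("launching:", " ".join(cmd))
    subprocess.run(cmd, check=True)
```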

jiafei1224 commented 5 years ago

Thanks for the help @fgtoralesch @beduffy @etendue!