jcoreyes opened 8 years ago
Thanks for a nice pull request; together these changes result in an almost 2x improvement!
But I would like to keep the code runnable on lesser GPUs as well, so I would like to have two ReplayMemory implementations that can be selected with a command-line switch. I would also like to keep the main code independent of Neon, so we need to figure out how to share the backend between ReplayMemory and DeepQNetwork without instantiating it in main. Or can we just use two separate backends? One possible approach is sketched below.
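As a rough illustration of the switch (only a sketch: the flag name `--replay_memory`, the class names `GpuReplayMemory`/`CpuReplayMemory`, the module layout, and the `net.be` attribute are all assumptions, not the project's actual API):

```python
# Hypothetical sketch: pick the replay memory implementation from the command
# line and reuse the backend that DeepQNetwork already creates internally,
# so main itself never imports Neon.
import argparse

from deepqnetwork import DeepQNetwork                        # assumed module layout
from replay_memory import CpuReplayMemory, GpuReplayMemory   # hypothetical classes

parser = argparse.ArgumentParser()
parser.add_argument("--replay_memory", choices=["cpu", "gpu"], default="cpu",
                    help="where to store the replay memory")
args = parser.parse_args()

net = DeepQNetwork()  # constructs its Neon backend internally, as it does now

if args.replay_memory == "gpu":
    # Reuse the network's backend; main only forwards the (assumed) net.be
    # attribute and never touches Neon directly.
    memory = GpuReplayMemory(size=1000000, backend=net.be)
else:
    memory = CpuReplayMemory(size=1000000)
```

The other option raised above, two separate backends, would keep the classes fully decoupled, but it is unclear whether two Neon backends on the same device coexist cleanly, so sharing the network's backend seems like the safer starting point.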
I also understood that the current version achieves about 38% GPU utilization on a Titan X. I wonder what could be done to reach 100%? Some ideas:
Does this fork really keep the replay memory on the GPU?
I tried the latest version, but my GPU usage looks like this:
```
$ nvidia-smi
Sun Jul 17 15:27:18 2016
+------------------------------------------------------+
| NVIDIA-SMI 352.93     Driver Version: 352.93         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 960     Off  | 0000:02:00.0     Off |                  N/A |
|  0%   51C    P2    32W / 160W |    126MiB /  4095MiB |     28%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      2963    C   python                                         112MiB |
+-----------------------------------------------------------------------------+
```
And main memory:
```
  PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
  20   0 44.118g 6.766g 110244 R 100.0 43.2 534:05.11 python
```
6.766g is about the size of the 1M-transition replay memory in main memory.
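For a rough sanity check of that figure, assuming the replay memory stores one 84x84 uint8 screen per transition (standard DQN preprocessing; the exact field layout in this repo may differ):

```python
# Back-of-the-envelope size of a 1M-transition replay memory that stores
# 84x84 uint8 screens; actions/rewards/terminals add only a few extra MB.
capacity = 10**6
screen_bytes = 84 * 84                   # one byte per pixel
total_bytes = capacity * screen_bytes    # about 7.06e9 bytes
print(total_bytes / 2**30)               # about 6.57 GiB
```

That lines up with the ~6.8g resident size above, while the process only holds 126MiB on the GPU, which suggests the screens are sitting in host RAM rather than on the GPU.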