glinscott / leela-chess

**MOVED TO https://github.com/LeelaChessZero/leela-chess** A chess adaption of GCP's Leela Zero
http://lczero.org
GNU General Public License v3.0

Lc0 cudnn does not parallelize properly across multiple GPUs #687

Open chara1ampos opened 6 years ago

chara1ampos commented 6 years ago

I am trying to run Alexander's lc0 cudnn build on 3 GPUs (1 Titan V + 2 x 1080). I start it with the command

`lc0.exe -w weights.txt --no-smart-pruning --backend=multiplexing "--backend-opts=x(backend=cudnn,gpu=0,max_batch=512),y(backend=cudnn,gpu=1,max_batch=256),z(backend=cudnn,gpu=2,max_batch=256)" --threads=4`

and then type `go nodes 130000` to run a benchmark. However, adding the two 1080s to the Titan V does not raise the NPS; in fact, the NPS drops slightly. Also, the utilization of each GPU is 30%, whereas when I run lc0 on a single GPU, that GPU's utilization is 90%.
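To make the option string in the command above easier to vary between runs (for example, testing one, two, or three GPUs), here is a minimal Python sketch that assembles it. The helper name `build_backend_opts` and the tuple layout are my own, purely for illustration; the only thing taken from lc0 is the option syntax exactly as it appears in the command above.

```python
def build_backend_opts(gpus):
    """Assemble the value passed to --backend-opts for the multiplexing backend.

    `gpus` is a list of (name, gpu_index, max_batch) tuples. The output format
    mirrors the command quoted above; nothing else about lc0 is assumed.
    """
    parts = []
    for name, gpu_index, max_batch in gpus:
        parts.append(f"{name}(backend=cudnn,gpu={gpu_index},max_batch={max_batch})")
    return ",".join(parts)


# The three-GPU configuration from the command above:
# Titan V as gpu 0, the two 1080s as gpu 1 and gpu 2.
opts = build_backend_opts([("x", 0, 512), ("y", 1, 256), ("z", 2, 256)])
print(f'"--backend-opts={opts}"')
```

Dropping entries from the list gives the one-GPU and two-GPU variants for an NPS comparison against the same `go nodes 130000` run.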

Why doesn't lc0 properly parallelize / fully utilize the multiple GPUs?
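In case it helps anyone reproduce the utilization figures, this is a rough sketch of how per-GPU utilization could be sampled while the benchmark runs. It assumes `nvidia-smi` is on the PATH; the function names `parse_utilization` and `sample_utilization` are hypothetical helpers, not part of lc0 or any NVIDIA tooling.

```python
import subprocess

# Ask nvidia-smi for one "index, utilization" CSV row per GPU, without
# a header line or unit suffixes.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_utilization(csv_text):
    """Turn rows such as '0, 30' into a {gpu_index: percent} dict."""
    utilization = {}
    for line in csv_text.strip().splitlines():
        index, percent = (field.strip() for field in line.split(","))
        utilization[int(index)] = int(percent)
    return utilization


def sample_utilization():
    """Run nvidia-smi once and return the current utilization per GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_utilization(result.stdout)
```

Calling `sample_utilization()` every second or so during `go nodes 130000` gives a utilization trace per GPU, which can be compared between the single-GPU and three-GPU runs.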