Closed pingjun18-li closed 7 years ago
Hello, below is the output of running nvidia-smi. How should I modify the code so that these GPUs are used efficiently? Every time I train, only one GPU is running, and I only get about 1400 iterations per hour.

|   0  GeForce GTX TIT...  Off  | 0000:02:00.0      On |                  N/A |
| 22%   38C    P8    15W / 250W |     72MiB / 12286MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX TIT...  Off  | 0000:03:00.0     Off |                  N/A |
| 22%   39C    P8    15W / 250W |     23MiB / 12287MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX TIT...  Off  | 0000:82:00.0     Off |                  N/A |
| 22%   37C    P8    15W / 250W |     23MiB / 12287MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX TIT...  Off  | 0000:83:00.0     Off |                  N/A |
| 22%   38C    P8    14W / 250W |     23MiB / 12287MiB |      0%      Default |
Did you compile Caffe with MPI enabled, and are you running the training with mpirun as in this script?
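For illustration, here is a rough sketch of what launching under mpirun could look like, assuming one training process per GPU and a Caffe binary that was built with MPI support. The helper name `build_mpirun_command`, the binary path `./build/tools/caffe`, and the solver file `solver.prototxt` are placeholders I made up for this example, not paths taken from the script mentioned above; only `mpirun -np` and `caffe train --solver=` are standard usage.

```python
import shlex


def build_mpirun_command(num_gpus, caffe_bin, solver_path):
    """Return an argv list that starts one Caffe training process per GPU.

    Assumption: the Caffe build maps each MPI rank to its own GPU, so
    the process count passed to `-np` equals the number of GPUs.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        "mpirun", "-np", str(num_gpus),  # number of MPI processes to launch
        caffe_bin, "train",              # standard Caffe training subcommand
        f"--solver={solver_path}",       # hypothetical solver definition file
    ]


if __name__ == "__main__":
    # Four GPUs, matching the nvidia-smi listing in the question.
    cmd = build_mpirun_command(4, "./build/tools/caffe", "solver.prototxt")
    print(shlex.join(cmd))
```

If the binary is launched directly instead of through mpirun, only a single process starts, which would match the symptom of one busy GPU and three idle ones.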