twitter-archive / torch-ipc

A set of primitives for parallel computation in Torch
Apache License 2.0

What is the correct performance of multiple GPUs? #19

Closed chienlinhuang1116 closed 8 years ago

chienlinhuang1116 commented 8 years ago

Hi,

I tested the MNIST dataset on multiple GPUs (Tesla K80) and measured the training time under different settings. I have some questions based on the following results: (1) Are these results expected? 'disable ACS & enable IPC' is about 10% faster (relative) than 'enable ACS & disable IPC'. (2) Microsoft reported that their one-bit SGD method (in the CNTK toolkit) runs about 7 times faster on 8 GPUs than on a single GPU. I only get about a 2x speedup going from a single GPU (656.3 sec) to 8 GPUs (357.0 sec). Are my settings and results correct? (3) What is the expected performance on multiple GPUs using torch-ipc and torch-distlearn?
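For reference, the scaling implied by those two times works out as follows (just a back-of-the-envelope check of the numbers above, not an additional benchmark):

```lua
-- Quick check of the reported scaling (times quoted above)
local t1, t8 = 656.3, 357.0        -- seconds on 1 GPU and on 8 GPUs
local speedup = t1 / t8            -- about 1.84x
local efficiency = speedup / 8     -- about 0.23, i.e. roughly 23% parallel efficiency
print(string.format('speedup %.2fx, efficiency %.0f%%', speedup, efficiency * 100))
```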

Thank you very much, Chien-Lin

1-epoch results (image attachment)

5-epoch results (image attachment)

zakattacktwitter commented 8 years ago

Hi,

Glad you got your GPUs working. The MNIST problem is tiny and not really big enough to demonstrate big speed-ups from distributed learning. You might consider trying a model with a much more complicated/expensive gradient as well as a much larger dataset.

I'd include a model of that size as an example if one existed in the public domain.
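For reference, the pattern the torch-distlearn examples follow is data-parallel SGD: each node computes gradients on its own shard of the data, the gradients are summed and normalized across nodes over a torch-ipc tree, and each node then applies an ordinary SGD step locally. Roughly, following the AllReduceSGD interface described in the torch-distlearn README (the tree/client/server setup is omitted here, and computeGrads / sgdStep / nextBatchForThisNode stand in for your own model code):

```lua
-- A rough sketch of the data-parallel loop used with torch-distlearn's AllReduceSGD.
-- Assumptions: `tree` is an ipc.Tree already connecting all of the nodes
-- (client/server setup omitted; see the torch-ipc / torch-distlearn examples),
-- and computeGrads / sgdStep / nextBatchForThisNode are your own model code.
local allReduceSGD = require 'distlearn.AllReduceSGD'(tree)

-- Start every node from identical parameter values
allReduceSGD.synchronizeParameters(params)

for epoch = 1, numEpochs do
   for step = 1, stepsPerEpoch do
      -- Each node computes gradients on its own shard of the data
      local grads = computeGrads(params, nextBatchForThisNode())
      -- Sum and normalize the gradients across all nodes over the ipc tree
      allReduceSGD.sumAndNormalizeGradients(grads)
      -- Apply an ordinary SGD update locally
      sgdStep(params, grads, learningRate)
   end
   -- Re-synchronize parameters so every node agrees exactly before validation
   allReduceSGD.synchronizeParameters(params)
end
```

The per-step cost here is one all-reduce of the gradient tensors, so a tiny model like the MNIST net spends proportionally more of its time communicating than computing, which is why the measured speedup stays well below 8x.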

Hope this helps, Zak


chienlinhuang1116 commented 8 years ago

Thank you Zak :)