mila-iqia / training


add scaling task #17

Closed · Delaunay closed this issue 4 years ago

Delaunay commented 5 years ago

Replace the multi-GPU benchmark with a scaling benchmark.

Since most multi-GPU tasks use data parallelism, speed should scale roughly linearly with the number of GPUs.

So we measure scaling efficiency to evaluate multi-GPU setups. This test makes sure DataParallel scales linearly across GPUs.
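For context, a minimal sketch of the data-parallel pattern under test, assuming PyTorch and a torchvision ResNet (illustrative only, not code from this repo):

```python
# Minimal sketch of the data-parallel pattern being benchmarked (assumes
# PyTorch + torchvision; illustrative only, not code from this repo).
import torch
import torchvision.models as models

model = torch.nn.DataParallel(models.resnet50().cuda(), device_ids=[0, 1, 2, 3])

# Each forward pass splits the batch across the 4 GPUs (64 images each),
# so throughput should grow roughly linearly with the device count.
batch = torch.randn(256, 3, 224, 224).cuda()
with torch.no_grad():
    logits = model(batch)
print(logits.shape)  # torch.Size([256, 1000])
```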

| GPUs | RTX fp32 throughput | Speedup | Efficiency |
|-----:|--------------------:|--------:|-----------:|
| 1 | 183.28 | 1.00 | 100.00% |
| 2 | 357.99 | 1.95 | 97.66% |
| 3 | 507.75 | 2.77 | 92.35% |
| 4 | 678.30 | 3.70 | 92.52% |
| 5 | 849.75 | 4.64 | 92.73% |
| 6 | 1014.84 | 5.54 | 92.29% |
| 7 | 1187.39 | 6.48 | 92.55% |
| 8 | 1351.39 | 7.37 | 92.17% |

Reported number: avg 94.03%, sd 2%.

Closer to 100% is better.

Efficiency should be > 90% to pass, regardless of hardware vendor.
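The derived columns follow the usual definitions: speedup is throughput relative to a single GPU, and efficiency is speedup divided by the GPU count. A small sketch reproducing them from the throughput column (variable names are illustrative):

```python
# Recompute the table's Speedup and Efficiency columns from raw throughput.
throughputs = [183.28, 357.99, 507.75, 678.30, 849.75, 1014.84, 1187.39, 1351.39]

for n, t in enumerate(throughputs, start=1):
    speedup = t / throughputs[0]          # relative to a single GPU
    efficiency = speedup / n              # 1.0 == perfect linear scaling
    print(f"{n} GPUs: speedup {speedup:.2f}, efficiency {efficiency:.2%}")
```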

Delaunay commented 5 years ago

python scaling.py --devices 0 1 2 3 micro_bench.py --network resnet50 --fp16 1

Pending testing on a server.
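For illustration, a hypothetical sketch of what a driver like scaling.py could do: rerun the same micro-benchmark on 1..N of the listed GPUs and compare throughput. The benchmark's output format and the parsing below are assumptions, not the repo's actual interface:

```python
# Hypothetical scaling driver: rerun micro_bench.py on a growing subset of
# GPUs and compute scaling efficiency. Output parsing is assumed; the real
# scaling.py interface is not shown in this thread.
import os
import re
import subprocess

devices = [0, 1, 2, 3]
throughputs = []

for n in range(1, len(devices) + 1):
    # Expose only the first n GPUs to the child benchmark process.
    env = {**os.environ, "CUDA_VISIBLE_DEVICES": ",".join(map(str, devices[:n]))}
    out = subprocess.run(
        ["python", "micro_bench.py", "--network", "resnet50", "--fp16", "1"],
        env=env, capture_output=True, text=True, check=True,
    ).stdout
    # Assume the benchmark prints a line like "throughput: 183.28".
    throughputs.append(float(re.search(r"throughput:\s*([\d.]+)", out).group(1)))

for n, t in enumerate(throughputs, start=1):
    print(f"{n} GPU(s): efficiency {(t / throughputs[0]) / n:.2%}")
```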

Delaunay commented 5 years ago

benchmark that can be removed after merging:

Delaunay commented 4 years ago

./image_classification/scaling/pytorch/run.sh --repeat 10 --number 5 --network resnet18 --batch-size 32

Should work now.

breuleux commented 4 years ago

I have merged this manually along with other changes. Thanks!