hpi-xnor / BMXNet

(New version is out: https://github.com/hpi-xnor/BMXNet-v2) BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet
Apache License 2.0

less forward speed-up when batch size is larger #52

Closed mengwanguc closed 5 years ago

mengwanguc commented 5 years ago

Hi, first of all, thanks for the great work!

I used benchmark_score.py to evaluate the forward latency of Resnet-18 and Resnet-18-binary.

  1. Although Resnet-18-binary is about 1.5x faster on CPU at batch size 1, the speed-up decreases as the batch size grows. At batch size 32, the two models have almost the same latency. Do you know why that happens?

  2. The GPU performance of Resnet-18-binary is much worse than that of the floating-point model. I understand that your optimization focuses on CPU rather than GPU, but I expected the binary model to have at least similar GPU performance to the FP model. Why is it so much worse?


Here are my running results:

INFO:root:network: resnet-18-binary
INFO:root:device: gpu(0)
INFO:root:batch size 1, image/sec: 16.735898
INFO:root:batch size 2, image/sec: 25.027532
INFO:root:batch size 4, image/sec: 33.737085
INFO:root:batch size 8, image/sec: 41.273390
INFO:root:batch size 16, image/sec: 47.007433
INFO:root:batch size 32, image/sec: 50.493328

INFO:root:device: cpu(0)
INFO:root:batch size 1, image/sec: 6.693615
INFO:root:batch size 2, image/sec: 8.799900
INFO:root:batch size 4, image/sec: 11.307120
INFO:root:batch size 8, image/sec: 12.709365
INFO:root:batch size 16, image/sec: 12.371296
INFO:root:batch size 32, image/sec: 13.402594


INFO:root:network: resnet-18
INFO:root:device: gpu(0)
INFO:root:batch size 1, image/sec: 130.296734
INFO:root:batch size 2, image/sec: 192.971986
INFO:root:batch size 4, image/sec: 271.567828
INFO:root:batch size 8, image/sec: 338.648713
INFO:root:batch size 16, image/sec: 461.010049
INFO:root:batch size 32, image/sec: 486.325190

INFO:root:device: cpu(0)
INFO:root:batch size 1, image/sec: 4.363451
INFO:root:batch size 2, image/sec: 6.357484
INFO:root:batch size 4, image/sec: 8.384733
INFO:root:batch size 8, image/sec: 10.529395
INFO:root:batch size 16, image/sec: 11.955591
INFO:root:batch size 32, image/sec: 13.027583
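
For reference, the speed-up at each batch size can be worked out from the numbers above with a small script. This is only a sketch: the throughput values are copied from the logs in this post, while the dictionary names and the `speedup` helper are made up for illustration and are not part of benchmark_score.py.

```python
# Throughput in images/sec, copied from the logs above and keyed by batch size.
binary_cpu = {1: 6.693615, 2: 8.799900, 4: 11.307120,
              8: 12.709365, 16: 12.371296, 32: 13.402594}
float_cpu = {1: 4.363451, 2: 6.357484, 4: 8.384733,
             8: 10.529395, 16: 11.955591, 32: 13.027583}
binary_gpu = {1: 16.735898, 2: 25.027532, 4: 33.737085,
              8: 41.273390, 16: 47.007433, 32: 50.493328}
float_gpu = {1: 130.296734, 2: 192.971986, 4: 271.567828,
             8: 338.648713, 16: 461.010049, 32: 486.325190}


def speedup(binary, baseline):
    """Ratio of binary to floating-point throughput per batch size (hypothetical helper)."""
    return {bs: binary[bs] / baseline[bs] for bs in sorted(binary)}


cpu_speedup = speedup(binary_cpu, float_cpu)
gpu_speedup = speedup(binary_gpu, float_gpu)

for bs in cpu_speedup:
    # A ratio above 1.0 means the binary model is faster than the FP model.
    print(f"batch size {bs:2d}: cpu {cpu_speedup[bs]:.2f}x, gpu {gpu_speedup[bs]:.2f}x")
```

A CPU ratio that falls toward 1.0 as the batch size grows, together with GPU ratios well below 1.0, corresponds to the two observations in the questions above.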

yanghaojin commented 5 years ago

I posted some remarks on your question in another thread: #17