Open Amarintine opened 4 years ago
On GPUs, when the FLOP count is small, the main constraint on speed is memory bandwidth. Depthwise convolution is not very fast on GPUs but is very fast on CPUs, which is why MobileNet and GhostNet mainly report speed on ARM/CPU. If you read Chinese, you can refer to https://zhuanlan.zhihu.com/p/122943688 and https://www.zhihu.com/question/339909499.
In addition, your code for timing the GPU is not correct. CUDA kernels are launched asynchronously, so `time.time()` can return before the forward pass has actually finished; you should call `torch.cuda.synchronize()` after the forward pass. Please refer to https://discuss.pytorch.org/t/measuring-gpu-tensor-operation-speed/2513
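A minimal sketch of the timing pattern described above, assuming a PyTorch environment; the tiny `nn.Sequential` model is a hypothetical stand-in, swap in GhostNet or MobileNetV2. The key points are a warm-up phase and a `torch.cuda.synchronize()` both before starting and before stopping the timer:

```python
import time
import torch
import torch.nn as nn

def benchmark(model, x, warmup=10, iters=30):
    """Average forward-pass time; synchronizes so queued GPU kernels
    actually finish before the clock is read."""
    model.eval()
    with torch.no_grad():
        for _ in range(warmup):          # warm-up: excludes allocator/cudnn setup cost
            model(x)
        if x.is_cuda:
            torch.cuda.synchronize()     # drain queued kernels before starting the timer
        t0 = time.time()
        for _ in range(iters):
            model(x)
        if x.is_cuda:
            torch.cuda.synchronize()     # wait for the last forward pass to finish
    return (time.time() - t0) / iters

# Hypothetical toy model for illustration only.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
x = torch.randn(1, 3, 32, 32)
if torch.cuda.is_available():
    model, x = model.cuda(), x.cuda()
print(f"avg forward time: {benchmark(model, x):.6f} s")
```

Without the second `synchronize()`, the measured time mostly reflects kernel *launch* overhead rather than execution, which is exactly the bug in the timing loop below.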
Hi, I have tested the network against MobileNetV2, but GhostNet is not faster:

ghostnet: flops 147.505M, params 3.903M
mobilenetv2: flops 312.852M, params 2.225M

GhostNet has more parameters, and with the timing code below MobileNetV2 comes out faster, so I don't know what's wrong. (Note: `t` has to be initialized before the loop for the first iteration to print a valid time.)

```python
x = torch.randn(32, 3, 224, 224)
t = time.time()
for _ in range(30):
    with torch.no_grad():
        inputs = x.cuda()
        outputs = model(inputs)
    print(time.time() - t)
    t = time.time()
```

It seems that MobileNetV2 is faster than GhostNet?