DingXiaoH / RepVGG

RepVGG: Making VGG-style ConvNets Great Again
MIT License

Paper Question - Why less favored than MobileNets for low-powered devices? #101

Open ksachdeva opened 1 year ago

ksachdeva commented 1 year ago

Hi Xiaohan Ding,

This is such excellent work, and thank you for sharing it.

I was reading your paper, and in the conclusion I saw:

RepVGG models are fast, simple, and practical ConvNets designed for the maximum speed on GPU and specialized hardware, less concerning the number of parameters. They are more parameter-efficient than ResNets but may be less favored than the mobile-regime models like MobileNets [16, 30, 15] and ShuffleNets [41, 24] for low-power devices.

I would appreciate it if you could explain why using RepVGG would make less sense than MobileNets on low-power devices.

Is it simply because they are already optimized for fast memory access? Or is it that some of the optimizations here could create problems for those architectures?

Regards & thanks Kapil

Fred-Erik commented 1 year ago

I'd say that mobile phones have architectures which are more compute-bound than memory-bound compared to GPUs. This makes MobileNet and ShuffleNet more efficient in those contexts, because they use things like depthwise separable convolutions, which result in fewer (theoretical) FLOPs and parameters. However, when you run architectures like MobileNet on GPU-based hardware like Nvidia's Jetson, you'll notice they are not actually as fast there.
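To see why the FLOP savings matter on compute-bound hardware, here is a rough back-of-the-envelope comparison of multiply-accumulate (MAC) counts for a standard 3x3 convolution versus a depthwise-separable one. The layer sizes are hypothetical, just for illustration:

```python
# Theoretical MAC counts for a standard 3x3 conv vs. a depthwise-separable
# conv (depthwise 3x3 followed by pointwise 1x1). Hypothetical layer sizes.
def standard_conv_macs(h, w, c_in, c_out, k=3):
    # each of the h*w*c_out output elements needs k*k*c_in multiply-accumulates
    return h * w * c_out * k * k * c_in

def depthwise_separable_macs(h, w, c_in, c_out, k=3):
    depthwise = h * w * c_in * k * k   # one k*k filter per input channel
    pointwise = h * w * c_in * c_out   # 1x1 conv mixes the channels
    return depthwise + pointwise

h = w = 56
c_in = c_out = 128
std = standard_conv_macs(h, w, c_in, c_out)
dws = depthwise_separable_macs(h, w, c_in, c_out)
print(std, dws, std / dws)  # the separable version needs roughly 8x fewer MACs
```

The catch, as noted above, is that MAC count is only a proxy: the depthwise conv has a poor compute-to-memory-access ratio, so on GPUs it tends to be memory-bandwidth-limited rather than compute-limited.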

The ideas from RepVGG can easily be extended to depthwise separable convolutions, i.e. MobileNet-like blocks, though. See Apple's work on MobileOne: https://arxiv.org/abs/2206.04040
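For anyone curious what the core re-parameterization trick looks like, here is a minimal sketch (ignoring BatchNorm for simplicity, with made-up channel counts): a 3x3 branch, a 1x1 branch, and an identity branch are fused into a single equivalent 3x3 kernel by padding the smaller kernels to 3x3 and summing.

```python
import numpy as np

c = 4                               # number of channels (hypothetical)
k3 = np.random.randn(c, c, 3, 3)    # 3x3 branch weights
k1 = np.random.randn(c, c, 1, 1)    # 1x1 branch weights

# pad the 1x1 kernel to 3x3 by placing it at the centre
k1_as_3x3 = np.zeros_like(k3)
k1_as_3x3[:, :, 1, 1] = k1[:, :, 0, 0]

# identity branch = 3x3 kernel with a 1 at the centre, mapping channel i to i
id_as_3x3 = np.zeros_like(k3)
for i in range(c):
    id_as_3x3[i, i, 1, 1] = 1.0

# one kernel for inference instead of three branches
fused = k3 + k1_as_3x3 + id_as_3x3

# verify equivalence on a single 3x3 input patch: the fused conv output
# equals the sum of the three branch outputs at the patch centre
x = np.random.randn(c, 3, 3)
conv = lambda w: np.einsum('oihw,ihw->o', w, x)
branch_sum = conv(k3) + k1[:, :, 0, 0] @ x[:, 1, 1] + x[:, 1, 1]
print(np.allclose(conv(fused), branch_sum))
```

This is exactly why the trick transfers to MobileNet-style blocks: the same padding-and-summing works per-channel for depthwise kernels, which is what MobileOne exploits.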