Edge/mobile CNN+Transformer hybrid DNN backbone inference benchmark (currently computer vision tasks only)
We filter the models that satisfy one of the conditions below:

Model | Top-1 | Top-1 //20 est. | Top-1 //50 est. | #params | GMACs | weight |
---|---|---|---|---|---|---|
efficientformerv2_s0 | 76.2 | 76.3 | 76.0 | 3.5M | 0.40G | eformer_s0_450.pth |
efficientformerv2_s1 | 79.7 | 78.8 | 79.6 | 6.1M | 0.65G | eformer_s1_450.pth |
efficientformerv2_s2 | 82.0 | 82.0 | 81.9 | 12.6M | 1.25G | eformer_s2_450.pth |
SwiftFormer_XS | 75.7 | 76.1 | 75.3 | 3.5M | 0.4G | SwiftFormer_XS_ckpt.pth |
SwiftFormer_S | 78.5 | 78.3 | 78.3 | 6.1M | 1.0G | SwiftFormer_S_ckpt.pth |
SwiftFormer_L1 | 80.9 | 80.7 | 81.8 | 12.1M | 1.6G | SwiftFormer_L1_ckpt.pth |
EMO_1M | 71.5 | 70.7 | 68.3 | 1.3M | 0.26G | EMO_1M.pth |
EMO_2M | 75.1 | 74.8 | 73.6 | 2.3M | 0.44G | EMO_2M.pth |
EMO_5M | 78.4 | 78.2 | 77.6 | 5.1M | 0.90G | EMO_5M.pth |
EMO_6M | 79.0 | 79.2 | 77.9 | 6.1M | 0.96G | EMO_6M.pth |
edgenext_xx_small | 71.2 | 70.8 | 70.4 | 1.3M | 0.26G | edgenext_xx_small.pth |
edgenext_x_small | 74.9 | 74.9 | 74.9 | 2.3M | 0.54G | edgenext_x_small.pth |
edgenext_small/usi | 81.1 | 80.8 | 80.0 | 5.6M | 1.26G | edgenext_small_usi.pth |
mobilevitv2_050 | 70.2 | 69.9 | 66.7 | 1.4M | 0.5G | mobilevitv2-0.5.pt |
mobilevitv2_075 | 75.6 | 75.0 | 74.4 | 2.9M | 1.0G | mobilevitv2-0.75.pt |
mobilevitv2_100 | 78.1 | 77.9 | 76.9 | 4.9M | 1.8G | mobilevitv2-1.0.pt |
[x] mobilevitv2_125 | 79.7 | 79.1 | 80.7 | 7.5M | 2.8G | mobilevitv2-1.25.pt |
[x] mobilevitv2_150 | 81.5 | 80.8 | 81.8 | 10.6M | 4.0G | mobilevitv2-1.5.pt |
[x] mobilevitv2_175 | 81.9 | 80.8 | 81.1 | 14.3M | 5.5G | mobilevitv2-1.75.pt |
[x] mobilevitv2_200 | 82.3 | 82.0 | 83.1 | 18.4M | 7.2G | mobilevitv2-2.0.pt |
mobilevit_xx_small | 68.9 | 68.9 | 66.6 | 1.3M | 0.36G | mobilevit_xxs.pt |
mobilevit_x_small | 74.7 | 74.3 | 73.9 | 2.3M | 0.89G | mobilevit_xs.pt |
mobilevit_small | 78.2 | 77.7 | 78.1 | 5.6M | 2.0G | mobilevit_s.pt |
LeViT_128S | 76.5 | 75.9 | 76.2 | 7.8M | 0.30G | LeViT-128S.pth |
LeViT_128 | 78.6 | 79.3 | 78.2 | 9.2M | 0.41G | LeViT-128.pth |
LeViT_192 | 79.9 | 79.8 | 79.3 | 11M | 0.66G | LeViT-192.pth |
[x] LeViT_256 | 81.6 | 81.2 | 81.4 | 19M | 1.12G | LeViT-256.pth |

Model | Top-1 | Top-1 //20 est. | Top-1 //50 est. | #params | GMACs | weight |
---|---|---|---|---|---|---|
resnet50 | 80.4 | 80.3 | 81.1 | 25.6M | 4.1G | |
mobilenetv3_large_100 | 75.8 | 75.7 | 75.3 | 5.5M | 0.29G | |
tf_efficientnetv2_b0 | 78.4 | 78.1 | 76.7 | 7.1M | 0.72G | |
tf_efficientnetv2_b1 | 79.5 | 79.3 | 79.4 | 8.1M | 1.2G | |
tf_efficientnetv2_b2 | 80.2 | 81.7 | 80.4 | 10.1M | 1.7G | |
tf_efficientnetv2_b3 | 81.6 | 81.9 | 82.0 | 14.4M | 3.0G | |
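
The "Top-1 //20 est." and "Top-1 //50 est." columns are not defined in this README. One plausible reading, and it is only an assumption, is that they are Top-1 accuracy estimated on a random subset of the validation set holding `len(dataset) // 20` or `len(dataset) // 50` samples. The sketch below illustrates that reading in plain Python. The function names `top1_accuracy` and `estimate_top1` are hypothetical and not part of this repo, and the uniform random sampling and fixed seed are illustrative choices.

```python
import random


def top1_accuracy(predictions, labels):
    """Return the percentage of samples whose predicted class equals the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return 100.0 * correct / len(labels)


def estimate_top1(predictions, labels, divisor, seed=0):
    """Estimate Top-1 on a random subset of len(labels) // divisor samples.

    ASSUMPTION: this mirrors one possible meaning of the "//20" and "//50"
    columns; the actual evaluation protocol is not documented here.
    """
    subset_size = len(labels) // divisor
    if subset_size == 0:
        raise ValueError("divisor is too large for this dataset")
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    indices = rng.sample(range(len(labels)), subset_size)
    return top1_accuracy(
        [predictions[i] for i in indices],
        [labels[i] for i in indices],
    )


if __name__ == "__main__":
    # Toy data: 1000 samples, 10 classes, roughly 80% of predictions correct.
    rng = random.Random(42)
    labels = [rng.randrange(10) for _ in range(1000)]
    predictions = [y if rng.random() < 0.8 else (y + 1) % 10 for y in labels]
    print("full:", top1_accuracy(predictions, labels))
    print("//20:", estimate_top1(predictions, labels, 20))
    print("//50:", estimate_top1(predictions, labels, 50))
```

Under this reading, a smaller subset gives a noisier estimate, which would be consistent with the "//50" column deviating more from the full Top-1 than the "//20" column for several models above.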