YOLOv5-Lite is the result of a series of ablation experiments on YOLOv5 that make it lighter (fewer FLOPs, lower memory use, and fewer parameters), faster (channel shuffle is added and the channels of the YOLOv5 head are reduced; it runs at 10+ FPS on the Raspberry Pi 4B with 320×320 input frames), and easier to deploy (the Focus layer and its four slice operations are removed, and the accuracy loss from model quantization is kept within an acceptable range).
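The channel shuffle mentioned above can be illustrated with a minimal sketch. It assumes the standard ShuffleNet-style operation (split the channels into groups, transpose, flatten) and works on a plain list of channel labels rather than a tensor. The function name `channel_shuffle` and its `groups` parameter are illustrative and are not taken from this repository.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (ShuffleNet-style shuffle).

    `channels` is a plain list standing in for the channel axis of a
    feature map; this is an illustration, not the repository's code.
    """
    if len(channels) % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = len(channels) // groups
    # View the flat channel list as a (groups, per_group) grid.
    grid = [channels[g * per_group:(g + 1) * per_group] for g in range(groups)]
    # Transpose to (per_group, groups) and flatten back to one list.
    return [grid[g][i] for i in range(per_group) for g in range(groups)]


if __name__ == "__main__":
    print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))  # [0, 3, 1, 4, 2, 5]
```

After the shuffle, each consecutive pair of channels contains one channel from every group, which is what lets information mix between groups at almost no compute cost.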
ID | Model | Input_size | FLOPs | Params | Size(M) | mAP@0.5 | mAP@0.5:0.95 |
---|---|---|---|---|---|---|---|
001 | yolo-fastest | 320×320 | 0.25G | 0.35M | 1.4 | 24.4 | - |
002 | YOLOv5-Lite-e (ours) | 320×320 | 0.73G | 0.78M | 1.7 | 35.1 | - |
003 | NanoDet-m | 320×320 | 0.72G | 0.95M | 1.8 | - | 20.6 |
004 | yolo-fastest-xl | 320×320 | 0.72G | 0.92M | 3.5 | 34.3 | - |
005 | YOLOXNano | 416×416 | 1.08G | 0.91M | 7.3(fp32) | - | 25.8 |
006 | yolov3-tiny | 416×416 | 6.96G | 6.06M | 23.0 | 33.1 | 16.6 |
007 | yolov4-tiny | 416×416 | 5.62G | 8.86M | 33.7 | 40.2 | 21.7 |
008 | YOLOv5-Lite-s (ours) | 416×416 | 1.66G | 1.64M | 3.4 | 42.0 | 25.2 |
009 | YOLOv5-Lite-c (ours) | 512×512 | 5.92G | 4.57M | 9.2 | 50.9 | 32.5 |
010 | NanoDet-EfficientLite2 | 512×512 | 7.12G | 4.71M | 18.3 | - | 32.6 |
011 | YOLOv5s(6.0) | 640×640 | 16.5G | 7.23M | 14.0 | 56.0 | 37.2 |
012 | YOLOv5-Lite-g (ours) | 640×640 | 15.6G | 5.39M | 10.9 | 57.6 | 39.1 |
See the wiki: https://github.com/ppogg/YOLOv5-Lite/wiki/Test-the-map-of-models-about-coco
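To make the comparison at 640×640 concrete, the sketch below computes how much smaller row 012 is than row 011 (YOLOv5s 6.0) using only the numbers in the table above. The helper name `relative_reduction` and the dictionary keys are illustrative choices, not part of the repository.

```python
def relative_reduction(baseline, candidate):
    """Return the percentage by which `candidate` is smaller than `baseline`."""
    return (baseline - candidate) / baseline * 100.0


# Values copied from rows 011 and 012 of the table (FLOPs in G, Params in M, Size(M)).
row_011_yolov5s = {"flops": 16.5, "params": 7.23, "size": 14.0}
row_012_lite_g = {"flops": 15.6, "params": 5.39, "size": 10.9}

reductions = {
    key: relative_reduction(row_011_yolov5s[key], row_012_lite_g[key])
    for key in row_011_yolov5s
}

if __name__ == "__main__":
    for key, pct in reductions.items():
        print(f"{key}: {pct:.1f}% smaller")
```

According to the same two rows, this reduction comes with a higher mAP@0.5 (57.6 versus 56.0), not a lower one.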
Equipment | Computing backend | System | Input | Framework | v5lite-e | v5lite-s | v5lite-c | v5lite-g | YOLOv5s |
---|---|---|---|---|---|---|---|---|---|
Intel | @i5-10210U | Windows(x86) | 640×640 | openvino | - | - | 46ms | - | 131ms |
Nvidia | @RTX 2080Ti | Linux(x86) | 640×640 | torch | - | - | - | 15ms | 14ms |
Redmi K30 | @Snapdragon 730G | Android(armv8) | 320×320 | ncnn | 27ms | 38ms | - | - | 163ms |
Xiaomi 10 | @Snapdragon 865 | Android(armv8) | 320×320 | ncnn | 10ms | 14ms | - | - | 163ms |
Raspberry Pi 4B | @ARM Cortex-A72 | Linux(arm64) | 320×320 | ncnn | - | 84ms | - | - | 371ms |
Raspberry Pi 4B | @ARM Cortex-A72 | Linux(arm64) | 320×320 | mnn | - | 71ms | - | - | 356ms |
AXera-Pi | Cortex A7@CPU, 3.6TOPs@NPU | Linux(arm64) | 640×640 | axpi | - | - | - | 22ms | 22ms |
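The latencies in the table can be converted to frame rates to check the "10+ FPS on the Raspberry Pi 4B" figure. The sketch below assumes each latency is the full per-frame time for one image processed at a time; the table does not state what the measurement includes, so treat the result as an upper bound. The function name `latency_ms_to_fps` is illustrative.

```python
def latency_ms_to_fps(latency_ms):
    """Convert a per-frame latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# v5lite-s latencies on the Raspberry Pi 4B at 320×320, copied from the table.
raspberry_pi_4b_v5lite_s_ms = {"ncnn": 84, "mnn": 71}

fps = {
    backend: latency_ms_to_fps(ms)
    for backend, ms in raspberry_pi_4b_v5lite_s_ms.items()
}

if __name__ == "__main__":
    for backend, value in fps.items():
        print(f"{backend}: {value:.1f} FPS")
```

Under that assumption, both backends land above 10 FPS for v5lite-s (roughly 11.9 FPS with ncnn and 14.1 FPS with mnn).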
https://zhuanlan.zhihu.com/p/672633849
Answer for joining the group: pruning, distillation, quantization, or low-rank decomposition (any one of them is accepted)
Model | Size | Backbone | Head | Framework | Design for |
---|---|---|---|---|---|
v5Lite-e.pt | 1.7m | shufflenetv2(Megvii) | v5Litee-head | Pytorch | Arm-cpu |
v5Lite-e.bin, v5Lite-e.param | 1.7m | shufflenetv2 | v5Litee-head | ncnn | Arm-cpu |
v5Lite-e-int8.bin, v5Lite-e-int8.param | 0.9m | shufflenetv2 | v5Litee-head | ncnn | Arm-cpu |
v5Lite-e-fp32.mnn | 3.0m | shufflenetv2 | v5Litee-head | mnn | Arm-cpu |
v5Lite-e-fp32.tnnmodel, v5Lite-e-fp32.tnnproto | 2.9m | shufflenetv2 | v5Litee-head | tnn | Arm-cpu |
v5Lite-e-320.onnx | 3.1m | shufflenetv2 | v5Litee-head | onnxruntime | x86-cpu |
Model | Size | Backbone | Head | Framework | Design for |
---|---|---|---|---|---|
v5Lite-s.pt | 3.4m | shufflenetv2(Megvii) | v5Lites-head | Pytorch | Arm-cpu |
v5Lite-s.bin, v5Lite-s.param | 3.3m | shufflenetv2 | v5Lites-head | ncnn | Arm-cpu |
v5Lite-s-int8.bin, v5Lite-s-int8.param | 1.7m | shufflenetv2 | v5Lites-head | ncnn | Arm-cpu |
v5Lite-s.mnn | 3.3m | shufflenetv2 | v5Lites-head | mnn | Arm-cpu |
v5Lite-s-int4.mnn | 987k | shufflenetv2 | v5Lites-head | mnn | Arm-cpu |
v5Lite-s-fp16.bin, v5Lite-s-fp16.xml | 3.4m | shufflenetv2 | v5Lites-head | openvino | x86-cpu |
v5Lite-s-fp32.bin, v5Lite-s-fp32.xml | 6.8m | shufflenetv2 | v5Lites-head | openvino | x86-cpu |
v5Lite-s-fp16.tflite | 3.3m | shufflenetv2 | v5Lites-head | tflite | Arm-cpu |
v5Lite-s-fp32.tflite | 6.7m | shufflenetv2 | v5Lites-head | tflite | Arm-cpu |
v5Lite-s-int8.tflite | 1.8m | shufflenetv2 | v5Lites-head | tflite | Arm-cpu |
v5Lite-s-416.onnx | 6.4m | shufflenetv2 | v5Lites-head | onnxruntime | x86-cpu |
Model | Size | Backbone | Head | Framework | Design for |
---|---|---|---|---|---|
v5Lite-c.pt | 9m | PPLcnet(Baidu) | v5s-head | Pytorch | x86-cpu / x86-vpu |
v5Lite-c.bin, v5Lite-c.xml | 8.7m | PPLcnet | v5s-head | openvino | x86-cpu / x86-vpu |
v5Lite-c-512.onnx | 18m | PPLcnet | v5s-head | onnxruntime | x86-cpu |
Model | Size | Backbone | Head | Framework | Design for |
---|---|---|---|---|---|
v5Lite-g.pt | 10.9m | Repvgg(Tsinghua) | v5Liteg-head | Pytorch | x86-gpu / arm-gpu / arm-npu |
v5Lite-g-int8.engine | 8.5m | Repvgg-yolov5 | v5Liteg-head | Tensorrt | x86-gpu / arm-gpu / arm-npu |
v5lite-g-int8.tmfile | 8.7m | Repvgg-yolov5 | v5Liteg-head | Tengine | arm-npu |
v5Lite-g-640.onnx | 21m | Repvgg-yolov5 | yolov5-head | onnxruntime | x86-cpu |
v5Lite-g-640.joint | 7.1m | Repvgg-yolov5 | yolov5-head | axpi | arm-npu |
- [ ] v5lite-e.pt: | Baidu Drive | Google Drive |
  |──────ncnn-fp16: | Baidu Drive | Google Drive |
  |──────ncnn-int8: | Baidu Drive | Google Drive |
  |──────mnn-e_bf16: | Google Drive |
  |──────mnn-d_bf16: | Google Drive |
  └──────onnx-fp32: | Baidu Drive | Google Drive |
- [ ] v5lite-s.pt: | Baidu Drive | Google Drive |
  |──────ncnn-fp16: | Baidu Drive | Google Drive |
  |──────ncnn-int8: | Baidu Drive | Google Drive |
  └──────tengine-fp32: | Baidu Drive | Google Drive |
- [ ] v5lite-c.pt: | Baidu Drive | Google Drive |
  └──────openvino-fp16: | Baidu Drive | Google Drive |
- [ ] v5lite-g.pt: | Baidu Drive | Google Drive |
  └──────axpi-int8: | Google Drive |
Baidu Drive Password: pogg
https://github.com/PINTO0309/PINTO_model_zoo/tree/main/180_YOLOv5-Lite