ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Pruning : Reducing number of kernels, thus reducing the number of parameters DOES NOT improve inference speed! #6598

Closed LeoSouquet closed 2 years ago

LeoSouquet commented 2 years ago

Search before asking

Question

Hi Guys,

I am working to implement a pruning technique inspired by SlimYOLOv3 (https://arxiv.org/abs/1907.11093). The objective is to reduce the number of kernels per layer, thus reducing the number of parameters.

However, in a preliminary experiment, I took the yolov3.yaml file and divided all kernel counts by two. This brought the number of parameters down from:

Original YOLOv3: Model Summary: 261 layers, 61508200 parameters, 0 gradients, 154.7 GFLOPs
Slim YOLOv3: Model Summary: 261 layers, 15394648 parameters, 0 gradients, 38.9 GFLOPs

I fine-tuned the pruned model (from COCO weights) to bring the mAP back up, and then tested the inference speed.

However, the inference speed remains the same as with the regular yolov3.yaml. I don't understand this at all, as the number of parameters and GFLOPs have been drastically reduced.

Any idea why that is?
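For reference, here is roughly how I measure the forward-pass time (a minimal sketch; the hub yolov5s load below is just a stand-in for my pruned checkpoint, and a CUDA device is assumed):

```python
import time
import torch

# Stand-in model; in my case this is the pruned YOLOv3 checkpoint
model = torch.hub.load('ultralytics/yolov5', 'yolov5s').to('cuda').eval()
x = torch.zeros(1, 3, 416, 416, device='cuda')  # dummy input at my inference size

with torch.no_grad():
    for _ in range(10):       # warm-up so kernel launches/allocations don't skew timing
        model(x)
    torch.cuda.synchronize()
    t0 = time.time()
    for _ in range(100):
        model(x)
    torch.cuda.synchronize()  # wait for queued GPU work before stopping the clock
print(f'{(time.time() - t0) / 100 * 1000:.1f} ms per forward pass')
```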

Additional

Here is my pruned yolov3.yaml

```yaml
# Parameters
nc: 1  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:

# darknet53 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [16, 3, 1]],  # 0
   [-1, 1, Conv, [32, 3, 2]],  # 1-P1/2
   [-1, 1, Bottleneck, [32, False]],
   [-1, 1, Conv, [64, 3, 2]],  # 3-P2/4
   [-1, 2, Bottleneck, [64, False]],
   [-1, 1, Conv, [128, 3, 2]],  # 5-P3/8
   [-1, 8, Bottleneck, [128, False]],
   [-1, 1, Conv, [256, 3, 2]],  # 7-P4/16
   [-1, 8, Bottleneck, [256, False]],
   [-1, 1, Conv, [512, 3, 2]],  # 9-P5/32
   [-1, 4, Bottleneck, [512, False]],  # 10
  ]

# YOLOv3 head
head:
  [[-1, 1, Bottleneck, [512, False]],
   [-1, 1, Conv, [256, [1, 1]]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],  # 15 (P5/32-large)

   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 8], 1, Concat, [1]],  # cat backbone P4
   [-1, 1, Bottleneck, [256, False]],
   [-1, 1, Bottleneck, [256, False]],
   [-1, 1, Conv, [126, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],  # 22 (P4/16-medium)

   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P3
   [-1, 1, Bottleneck, [128, False]],
   [-1, 2, Bottleneck, [128, False]],  # 27 (P3/8-small)

   [[27, 22, 15], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
```

glenn-jocher commented 2 years ago

@LeoCyclope I don't provide feedback for code customizations or user research, but in general reducing the YOLOv5 compound scaling constants naturally produces speed improvements which are quantified in our README results, i.e.:

(Screenshot of the YOLOv5 README speed/accuracy results table, 2022-02-10)
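As a sketch of how that scaling is expressed in the model YAMLs, channels can be halved globally via width_multiple instead of editing every layer (parse_model multiplies each layer's channel count by width_multiple and rounds it to a multiple of 8):

```yaml
# Sketch: scale channels globally rather than editing each layer's args
nc: 80               # number of classes
depth_multiple: 1.0  # model depth multiple (scales the number of Bottleneck repeats)
width_multiple: 0.5  # layer channel multiple (~half the kernels in every Conv)
```
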
LeoSouquet commented 2 years ago

Thanks for your reply 👍 I understand your point.

Quick question: I tried, with a regular YOLOv3 provided by you, to run an evaluation (using val.py) and I get:

Batch size 32: Speed: 0.1ms pre-process, 1.7ms inference, 3.9ms NMS per image at shape (32, 3, 416, 416)
Batch size 1: Speed: 0.2ms pre-process, 8.9ms inference, 1.2ms NMS per image at shape (1, 3, 416, 416)

This means inference is more than 5 times faster per image with a batch of 32 than with a batch of 1. From your experience, do those numbers make sense?
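For context, the two runs were roughly the following (the weights and data paths here are placeholders):

```shell
# batch of 32
python val.py --weights yolov3.pt --data coco.yaml --img 416 --batch-size 32

# batch of 1
python val.py --weights yolov3.pt --data coco.yaml --img 416 --batch-size 1
```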

Thanks a lot in advance,

Léo

glenn-jocher commented 2 years ago

@LeoCyclope 👋 Hello! Thanks for asking about inference speed issues. YOLOv5 🚀 can be run on CPU (i.e. --device cpu, slow) or GPU if available (i.e. --device 0, faster). You can determine your inference device by viewing the YOLOv5 console output:

detect.py inference

python detect.py --weights yolov5s.pt --img 640 --conf 0.25 --source data/images/

YOLOv5 PyTorch Hub inference

import torch

# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')

# Images
dir = 'https://ultralytics.com/images/'
imgs = [dir + f for f in ('zidane.jpg', 'bus.jpg')]  # batch of images

# Inference
results = model(imgs)
results.print()  # or .show(), .save()
# Speed: 631.5ms pre-process, 19.2ms inference, 1.6ms NMS per image at shape (2, 3, 640, 640)

Increase Speeds

If you would like to increase your inference speed, some options are sketched below.
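A few commonly used options, as a sketch (the weights, image sizes, and export formats below are illustrative, not a complete list):

```shell
# Run inference at a smaller image size
python detect.py --weights yolov5s.pt --img 320 --source data/images/

# FP16 half-precision inference on CUDA
python detect.py --weights yolov5s.pt --img 640 --half --source data/images/

# Export to an optimized runtime (e.g. ONNX or TensorRT), then run the exported model
python export.py --weights yolov5s.pt --include onnx engine --device 0
python detect.py --weights yolov5s.engine --img 640 --source data/images/
```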

Good luck 🍀 and let us know if you have any other questions!

github-actions[bot] commented 2 years ago

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.


Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!