ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Attention improved yolov5 performance #6006

Open 315386775 opened 2 years ago

315386775 commented 2 years ago

Search before asking

Description

Hi, I really appreciate this great work for the CV community. Attention modules can improve model performance, in the same spirit as the existing C3Ghost and C3SPP modules. I have tested a C3 module with GCNet attention, named C3_GC. With the C3_GC module, yolov5s achieves 36.12% mAP (original mAP: 35.35%). GFLOPs increase from 17.1 to 17.3 and parameters from 7.3M to 7.6M. Paper:

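```python
# Add to models/common.py, which already imports torch and nn and defines Conv and Bottleneck.
# ContextBlock2d (the GCNet context block) is not part of YOLOv5 and must be supplied separately.
class C3_GC(nn.Module):
    # C3 module with ContextBlock2d()
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ch_out, number, shortcut, groups, expansion
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.gc = ContextBlock2d(c1)  # global-context attention applied to the input
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.cv3 = Conv(2 * c_, c2, 1)  # act=FReLU(c2)
        self.m = nn.Sequential(*[Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)])

    def forward(self, x):
        # Branch 1: bottleneck stack. Branch 2: attention-refined shortcut. Concatenate, then fuse.
        out = torch.cat((self.m(self.cv1(x)), self.cv2(self.gc(x))), dim=1)
        return self.cv3(out)
```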
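The `ContextBlock2d` used above is not shown in this issue. As a rough orientation only, here is a minimal sketch of the global-context idea from GCNet in plain Python: a 1x1 conv scores every spatial position, a softmax turns the scores into attention weights, and the attention-weighted feature sum is added back to every position. The function name `global_context_add` and the flat-list layout are assumptions made for illustration, the bottleneck transform of the real block is omitted, and the author's `ContextBlock2d` may differ.

```python
import math


def global_context_add(x, w_key):
    """Hypothetical helper, for illustration only.

    x:     C channels, each a flat list of N = H*W values.
    w_key: C weights of a 1x1 conv that maps C channels to one score per position.
    Returns (features with the global context added, attention weights).
    """
    channels = len(x)
    positions = len(x[0])
    # 1x1 conv down to a single channel: one score per spatial position
    scores = [sum(w_key[c] * x[c][i] for c in range(channels)) for i in range(positions)]
    # Softmax over all positions (shifted by the max for numerical stability)
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    # Attention-weighted sum of the features: one context value per channel
    context = [sum(a * v for a, v in zip(attn, channel)) for channel in x]
    # Fusion by broadcast addition: every position receives the same context vector
    out = [[v + context[c] for v in x[c]] for c in range(channels)]
    return out, attn


# Two channels on a 2x2 map; zero key weights give uniform attention
features = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 8.0]]
fused, weights = global_context_add(features, [0.0, 0.0])
print(weights)
print(fused)
```

With zero key weights the attention is uniform, so the context reduces to the per-channel mean; learned weights would instead emphasize some positions over others.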

Use case

You need to modify yolov5s.yaml:

```yaml
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 9, C3_GC, [256, True]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3_GC, [512, True]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 1, SPP, [1024, [5, 9, 13]]],
   [-1, 3, C3_GC, [1024, False]],  # 9
  ]
```
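For orientation, the repeat counts and channel widths in this YAML are not used verbatim: YOLOv5 scales them with `depth_multiple` and `width_multiple` (0.33 and 0.50 for yolov5s). The sketch below mirrors that arithmetic in plain Python. `scale_row` is a hypothetical helper, not a function from the repository, and the result only applies to C3_GC under the assumption that it is registered in `parse_model` the same way C3 is.

```python
import math

DEPTH_MULTIPLE = 0.33  # yolov5s
WIDTH_MULTIPLE = 0.50  # yolov5s


def make_divisible(x, divisor=8):
    # Round up to the nearest multiple of divisor
    return math.ceil(x / divisor) * divisor


def scale_row(number, out_channels):
    """Hypothetical helper: effective (repeats, channels) for one YAML row."""
    repeats = max(round(number * DEPTH_MULTIPLE), 1) if number > 1 else number
    channels = make_divisible(out_channels * WIDTH_MULTIPLE, 8)
    return repeats, channels


# (number, module, output channels) taken from the backbone rows above
backbone = [
    (1, "Focus", 64),
    (1, "Conv", 128),
    (3, "C3", 128),
    (1, "Conv", 256),
    (9, "C3_GC", 256),
    (1, "Conv", 512),
    (9, "C3_GC", 512),
    (1, "Conv", 1024),
    (1, "SPP", 1024),
    (3, "C3_GC", 1024),
]

for number, module, channels in backbone:
    repeats, scaled = scale_row(number, channels)
    print(f"{module:6s} x{repeats}  {channels} -> {scaled} channels")
```

Under these assumptions, a row such as `[-1, 9, C3_GC, [256, True]]` ends up as three bottleneck repeats at 128 output channels in yolov5s.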

Additional

No response

Are you willing to submit a PR?

github-actions[bot] commented 2 years ago

👋 Hello @315386775, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

Requirements

Python>=3.6.0 with all requirements.txt installed including PyTorch>=1.7. To get started:

$ git clone https://github.com/ultralytics/yolov5
$ cd yolov5
$ pip install -r requirements.txt


ChinaRush commented 2 years ago

The latest YOLOv5 achieves 37.2% mAP, which is better than your result, with fewer parameters.

glenn-jocher commented 2 years ago

@315386775 thank you for your research and your contributions towards including attention!

Could you try to implement your updates on the latest v6.0 release of YOLOv5 to compare to the current baseline of 37.2 mAP as @ChinaRush mentions? See https://github.com/ultralytics/yolov5/releases/tag/v6.0

315386775 commented 2 years ago

@glenn-jocher @ChinaRush thank you for your suggestions. I have implemented my backbone updates on the latest v6.0 release of YOLOv5. The results:

a. Original (official) results:
Model Summary: 213 layers, 7225885 parameters, 0 gradients, 16.5 GFLOPs; mAP 0.560 at IoU=0.50, mAP 0.372 at IoU=0.50:0.95.

b. My reproduction of the original version:
Model Summary: 213 layers, 7225885 parameters, 0 gradients, 16.5 GFLOPs; mAP 0.568 at IoU=0.50, mAP 0.372 at IoU=0.50:0.95.

c. With the improved C3_GC module:
Model Summary: 252 layers, 7573984 parameters, 0 gradients, 16.8 GFLOPs; mAP 0.573 at IoU=0.50, mAP 0.377 at IoU=0.50:0.95.

Conclusion:

  1. mAP at IoU 0.5 improved by 0.5 points (0.568 to 0.573, relative to my reproduction).
  2. mAP at IoU 0.5:0.95 improved by 0.5 points (0.372 to 0.377).
  3. GFLOPs increased from 16.5 to 16.8, and parameters from 7225885 to 7573984.

I would like to contribute this new backbone improvement. Thanks.
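To put the reported numbers in proportion, the short calculation below works out the accuracy gain next to the relative cost in parameters and GFLOPs. It uses only the figures quoted in this comment, with run (b) as the baseline; the dictionary layout and the helper `relative_increase` are illustrative choices, not anything from the YOLOv5 code base.

```python
# Figures copied from runs (b) and (c) in this comment
baseline = {"params": 7225885, "gflops": 16.5, "map50": 0.568, "map50_95": 0.372}
c3_gc = {"params": 7573984, "gflops": 16.8, "map50": 0.573, "map50_95": 0.377}


def relative_increase(new, old):
    # Percentage increase of new over old
    return 100.0 * (new - old) / old


extra_params = c3_gc["params"] - baseline["params"]
params_pct = relative_increase(c3_gc["params"], baseline["params"])
gflops_pct = relative_increase(c3_gc["gflops"], baseline["gflops"])
map50_points = 100.0 * (c3_gc["map50"] - baseline["map50"])
map50_95_points = 100.0 * (c3_gc["map50_95"] - baseline["map50_95"])

print(f"extra parameters: {extra_params} (+{params_pct:.2f}%)")
print(f"extra GFLOPs: +{gflops_pct:.2f}%")
print(f"mAP@0.5 gain: {map50_points:.1f} points, mAP@0.5:0.95 gain: {map50_95_points:.1f} points")
```

In other words, roughly 4.8% more parameters and 1.8% more GFLOPs buy about half a point of mAP on both metrics.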

ChinaRush commented 2 years ago

You did a great job. Your model has more layers, which contributes to your mAP but also means more parameters, more FLOPs, and perhaps larger weight files and lower speed? Object detection is a trade-off between speed and accuracy, which is why YOLOv5 offers n, s, m, l and x models to meet different requirements.

315386775 commented 2 years ago

I agree with you, but the v6.0 release already contains several modified C3 modules, such as the C3 module with TransformerBlock(), the C3 module with SPP() and the C3 module with GhostBottleneck(). C3SPP() and C3TR() also have more parameters, more FLOPs, and perhaps larger weight files and lower speed. Attention modules have proven effective in vision tasks. YOLOv5 should offer more features for everyone to learn from and use. Thanks.

ChinaRush commented 2 years ago

you are right.

315386775 commented 2 years ago

@glenn-jocher Another enhancement: the degree of feature decoupling between the different FPN layers can be increased appropriately by adding conv layers before and after the FPN.

d. Results for the improved FPN with conv layers: Model Summary: 11024035 parameters, 22.7 GFLOPs; mAP 0.596 at IoU=0.50, mAP 0.405 at IoU=0.50:0.95.

The training speed and mAP fall between those of yolov5s and yolov5m.

Michelvl92 commented 2 years ago

@315386775 how did you implement adding conv layers before and after the FPN?

glenn-jocher commented 2 years ago

@315386775 sounds interesting! Could you add this last result to the first 3 in https://github.com/ultralytics/yolov5/issues/6006#issuecomment-1000603844 please? Maybe a table showing the four runs would be easier to read/compare also.

315386775 commented 2 years ago

| Model | mAP@0.5:0.95 | mAP@0.5 | params (M) | GFLOPs |
|---|---|---|---|---|
| 5s official | 37.2 | 56.0 | 7.2 | 16.5 |
| 5s recurrent | 37.2 | 56.8 | 7.2 | 16.5 |
| 5s backbone + C3GC | 37.7 | 57.3 | 7.5 | 16.8 |
| 5s C3GC + FPN_conv | 40.5 | 59.6 | 11.0 | 22.7 |
| 5m | 45.2 | 63.9 | 21.2 | 49.0 |

@glenn-jocher here are the results as a table.

glenn-jocher commented 2 years ago

@315386775 thanks for the table! I see you added 5m to it, which is a good idea to get an upper bookend.

I'm laughing to myself a bit because this is the main problem we all face, we add a few features to the model and then test it, and it falls somewhere along a line of performance connecting models smaller and larger than it, and often times it's not immediately clear that the updates are good or bad. More often the updates introduce compromises like what we see here, where they improve the result but may use more resources.

If anything looking at the table, it looks like '5s recurrent' provides "free" improvement? Maybe that's worth investigating a bit more? What's the main change in that model?

With the other ones I just can't say, we'd need some profiling results also like in the main README table to ensure they don't slow down inference too much also.
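One rough way to apply the "line connecting smaller and larger models" idea to the table earlier in this thread is to interpolate between 5s and 5m and see where the modified models land. The sketch below does that with GFLOPs as the cost axis. Treating the trade-off as linear and using GFLOPs rather than measured latency are both simplifying assumptions, and `interpolate` is a hypothetical helper.

```python
def interpolate(x, x0, y0, x1, y1):
    # Straight line through (x0, y0) and (x1, y1), evaluated at x
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)


# (GFLOPs, mAP@0.5:0.95) bookends from the table in this thread
small = (16.5, 37.2)  # 5s
large = (49.0, 45.2)  # 5m

candidates = {
    "5s backbone + C3GC": (16.8, 37.7),
    "5s C3GC + FPN_conv": (22.7, 40.5),
}

margins = {}
for name, (gflops, measured) in candidates.items():
    expected = interpolate(gflops, small[0], small[1], large[0], large[1])
    margins[name] = measured - expected
    print(f"{name}: line predicts {expected:.2f}, measured {measured:.1f}, margin {margins[name]:+.2f}")
```

A positive margin means the model sits above the 5s to 5m line at its GFLOPs budget. That says nothing about inference speed, which is why profiling results are still needed.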

315386775 commented 2 years ago

| Model | mAP@0.5:0.95 | mAP@0.5 | Speed CPU b1 (ms) | Speed 2080ti b1 (ms) | Speed 2080ti b32 (ms) | params (M) | GFLOPs |
|---|---|---|---|---|---|---|---|
| 5s official | 37.2 | 56.0 | 98 | 6.4 (V100) | 0.9 (V100) | 7.2 | 16.5 |
| 5s | 37.2 | 56.8 | 47.7 | 7.3 | 1.0 | 7.2 | 16.5 |
| 5s backbone + C3GC | 37.7 | 57.3 | 63.6 | 8.4 | 1.4 | 7.5 | 16.8 |
| 5s C3GC + FPN_conv | 40.5 | 59.6 | 92.7 | 12.0 | 1.5 | 11.0 | 22.7 |
| 5m | 45.2 | 63.9 | 224 | 8.2 (V100) | 1.7 (V100) | 21.2 | 49.0 |

@glenn-jocher conclusion and explanation:

a. The '5s recurrent' model has no changes compared with the original. I trained it with the official config and hyperparameters.
b. Considering speed and mAP, my new features need more inference time, and the mAP gain is small compared with 5m.
c. The new model with the C3GC() feature proved to be effective. The mAP of all YOLOv5 versions should increase with the new feature.
d. As with the C3SPP() and C3TR() features, can I submit a pull request for the C3GC() feature?
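The speed side of this trade-off can be made concrete from the rows measured on the same 2080 Ti at batch size 1. The following is a small sketch that compares each modified model with the reproduced 5s run. It assumes the speed columns are latencies in milliseconds, as in the main README table, and it leaves out the official 5s and 5m rows because those were measured on a V100.

```python
# (model, mAP@0.5:0.95, 2080 Ti batch-1 latency) copied from the table above.
# Assumption: latency is in milliseconds.
rows = [
    ("5s", 37.2, 7.3),
    ("5s backbone + C3GC", 37.7, 8.4),
    ("5s C3GC + FPN_conv", 40.5, 12.0),
]

base_name, base_map, base_latency = rows[0]
tradeoffs = {}
for name, map_value, latency in rows[1:]:
    map_gain = map_value - base_map
    latency_increase_pct = 100.0 * (latency - base_latency) / base_latency
    tradeoffs[name] = (map_gain, latency_increase_pct)
    print(f"{name}: {map_gain:+.1f} mAP for {latency_increase_pct:+.1f}% latency vs {base_name}")
```

On these numbers, C3GC costs about 15% more GPU latency for 0.5 mAP, while C3GC plus FPN_conv costs about 64% more for 3.3 mAP.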

ppogg commented 2 years ago

Hi. Are you interested in this project? Maybe we could get in touch. https://github.com/ppogg/YOLOv5-Lite

315386775 commented 2 years ago

@ppogg Sure. YOLOv5-Lite is also a great and useful project.

Mengyao-Zhang commented 2 years ago

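@glenn-jocher Another enhancement. The degree of feature decoupling between different layers of the FPN can be appropriately increased. Add conv layers before and after the FPN.

d. I tested the improved FPN with conv layers: Model Summary: 11024035 parameters, 22.7 GFLOPs; mAP 0.596 at IoU=0.50, mAP 0.405 at IoU=0.50:0.95.

The training speed and mAP are between those of yolov5s and yolov5m.

Could you please provide the network structure diagram and the detailed parameters of the added convolutions? Thank you very much!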

wilile26811249 commented 2 years ago

@315386775 Hi, Could you share how to implement the FPN_Conv block and the model config file? Thank you!

315386775 commented 2 years ago

@wilile26811249 OK, I will share the code in my Git repository.

prasannalathakothala commented 2 years ago

Traceback (most recent call last):
  File "/content/yolov5/train.py", line 636, in <module>
    main(opt)
  File "/content/yolov5/train.py", line 533, in main
    train(opt.hyp, opt, device, callbacks)
  File "/content/yolov5/train.py", line 124, in train
    model = Model(cfg or ckpt['model'].yaml, ch=3, nc=nc, anchors=hyp.get('anchors')).to(device)  # create
  File "/content/yolov5/models/yolo.py", line 103, in __init__
    self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch])  # model, savelist
  File "/content/yolov5/models/yolo.py", line 284, in parse_model
    m_ = nn.Sequential(*(m(*args) for _ in range(n))) if n > 1 else m(*args)  # module
  File "/content/yolov5/models/yolo.py", line 284, in <genexpr>
    m_ = nn.Sequential(*(m(*args) for _ in range(n))) if n > 1 else m(*args)  # module
  File "/content/yolov5/models/common.py", line 766, in __init__
    self.gc = ContextBlock2d()
TypeError: __init__() missing 4 required positional arguments: 'inplanes', 'planes', 'pool', and 'fusion'

I am getting this error. Please help me solve it.
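The last frame of the traceback shows `ContextBlock2d()` being called with no arguments, while the class in use requires `inplanes`, `planes`, `pool` and `fusion`. The sketch below reproduces the failure with a stand-in class and shows the general shape of a fix. The stand-in, the channel count, and the values `planes=c1 // 4`, `pool="att"` and `fusion="channel_add"` are illustrative assumptions only; the correct values depend on the `ContextBlock2d` implementation actually copied into `common.py`.

```python
class ContextBlock2dStandIn:
    """Hypothetical stand-in with the signature named in the error message."""

    def __init__(self, inplanes, planes, pool, fusion):
        self.inplanes = inplanes
        self.planes = planes
        self.pool = pool
        self.fusion = fusion


# Reproduce the failure: no arguments are passed
try:
    ContextBlock2dStandIn()
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Shape of a fix: pass all four arguments when building the block inside C3_GC.__init__
c1 = 128  # input channels of the C3_GC module (example value)
block = ContextBlock2dStandIn(inplanes=c1, planes=c1 // 4, pool="att", fusion="channel_add")
print(block.inplanes, block.planes, block.pool, block.fusion)
```

The code posted at the top of this issue calls `ContextBlock2d(c1)` with a single argument, so the two snippets appear to use different versions of the class; whichever version is used, the call in `C3_GC.__init__` has to match its signature.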

glenn-jocher commented 2 years ago

TODO: view Lite improvements from https://github.com/ultralytics/yolov5/issues/6006#issuecomment-1010106820

zhiqwang commented 2 years ago

Just FYI @glenn-jocher and all here,

Seems that TexasInstruments has also released their YOLOv5-ti-lite version https://github.com/TexasInstruments/edgeai-yolov5#yolov5-ti-lite-model-definition

And the notes for embedded device scenarios are very useful: https://github.com/TexasInstruments/edgeai-modelzoo/blob/master/models/vision/detection/readme.md#notes

The benefit of RegNetX backbone: https://github.com/TexasInstruments/edgeai-mmdetection/blob/master/docs/det_modelzoo.md#object-detection-model-zoo

We have config files for ResNet, RegNetX and MobileNet backbone architectures. Overall, the RegNetX family of architectures strike a good balance between complexity, accuracy and easiness of quantization.

glenn-jocher commented 2 years ago

@zhiqwang wow that's quite amazing. I used to spend a lot of time on the TI website looking at their chips, in my particle physics days Ultralytics developed circuit boards that had a few TI parts including an advanced 32ch TI ADC converter in one of my designs. I never imagined that they would be using my designs one day.

zhiqwang commented 2 years ago

Hi @glenn-jocher , Ultralytics and TI are both awesome, and I believe the combination will accelerate the adoption of AI.

yxx-byte commented 2 years ago

Hello, could you please share the weight file of the backbone network trained with C3-GC, preferably one trained on the COCO dataset?

jccmoing-bit commented 2 years ago

@315386775 I am very interested in your experimental results. Could you share how to implement the FPN_Conv block and the model config file? Thank you!

315386775 commented 2 years ago

@yxx-byte @jccmoing-bit the pretrained model and network are here: https://github.com/ppogg/YOLOv5-Lite/pull/124