YOLOonMe / EMA-attention-module

Implementation code for the ICASSP 2023 paper "Efficient Multi-Scale Attention Module with Cross-Spatial Learning". The paper is available at: https://arxiv.org/abs/2305.13563v2

EMA-attention-module

Results

Training on CIFAR-100 with ResNet for 200 epochs.

Name Resolution #Params Top-1 Acc. Top-5 Acc. BaiduDrive(models)
ResNet101 32 42.70M 77.78 94.39 -
+ CA 32 46.22M 80.01 94.78 -
+ EMA 32 42.96M 80.86 95.75 -
+ SSA-32 32 51.37M 80.61 95.26 -
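To put the CIFAR-100 numbers above in perspective, the short sketch below compares each attention variant against the plain ResNet101 baseline. It is only an illustration and not part of the repository: the helper name `compare_to_baseline` is hypothetical, and it assumes the columns are read as parameter count (in millions), Top-1 accuracy, and Top-5 accuracy.

```python
# Rows copied from the CIFAR-100 ResNet101 results above.
# Assumed column meaning: (name, #Params in millions, Top-1 %, Top-5 %).
ROWS = [
    ("ResNet101", 42.70, 77.78, 94.39),
    ("+ CA", 46.22, 80.01, 94.78),
    ("+ EMA", 42.96, 80.86, 95.75),
    ("+ SSA-32", 51.37, 80.61, 95.26),
]


def compare_to_baseline(rows):
    """Hypothetical helper: the first row is the baseline.

    Returns one tuple per variant:
    (name, extra params in M, parameter overhead in %, Top-1 gain in points).
    """
    _, base_params, base_top1, _ = rows[0]
    results = []
    for name, params, top1, _ in rows[1:]:
        extra = params - base_params
        overhead_pct = 100.0 * extra / base_params
        gain = top1 - base_top1
        results.append((name, round(extra, 2), round(overhead_pct, 2), round(gain, 2)))
    return results


if __name__ == "__main__":
    for name, extra, pct, gain in compare_to_baseline(ROWS):
        print(f"{name:9s} +{extra:.2f}M params ({pct:.2f}%)  Top-1 {gain:+.2f}")
```

Read this way, the EMA row adds about 0.26M parameters (roughly 0.6% of the baseline) for a 3.08 point Top-1 gain, the smallest overhead and the largest gain among the three variants listed.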

Training on ImageNet-1k with MobileNetv2 for 400 epochs.

Training on ImageNet-1k with MobileNetv2 for 200 epochs.

Name Resolution #Params MFLOPs Top-1 Acc. Top-5 Acc. BaiduDrive(models)
MobileNetv2 224 3.504M 300.79 72.192 90.534 -
+ EMA 224 - 302 72.55 90.89 ema

Training on COCO 2017 with YOLOv5s for 300 epochs.

Name Resolution #Params GFLOPs mAP@.5 mAP@.5:.95 BaiduDrive(models)
YOLOv5s 640 7.23M 16.5 56.0 37.2 yolov5s(v6.0)
+ CBAM 640 7.27M 16.6 57.1 37.7 cbam
+ SA 640 7.23M 16.5 56.8 37.4 sa
+ ECA 640 7.23M 16.5 57.1 37.6 eca
+ CA 640 7.26M 16.50 57.5 38.1 ca
+ EMA 640 7.24M 16.53 57.8 38.4 ema
+ SSA-32 640 7.27M 0 58.7 38.4
+ SSA-16 640 7.31M 0 58.1 38.5
+ SSA-2 640 8.55M 0 58.3 38.8
+ SSA-1 640 11.50M 0 58.8 39.1

Training on VisDrone 2019 with YOLOv5x.

Name Resolution #Params GFLOPs mAP@.5 mAP@.5:.95 BaiduDrive(models)
YOLOv5x (v6.0) 640 90.96M 314.2 49.29 30.0 -
+ CBAM 640 91.31M 315.1 49.40 30.1 -
+ CA 640 91.28M 315.2 49.30 30.1 -
+ EMA 640 91.18M 315.0 49.70 30.4 ema
+ SSA-32 640 91.18M 315.8 49.80 30.7

References

@INPROCEEDINGS{10096516,
  author={Ouyang, Daliang and He, Su and Zhang, Guozhong and Luo, Mingzhu and Guo, Huaiyong and Zhan, Jian and Huang, Zhijie},
  booktitle={ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)}, 
  title={Efficient Multi-Scale Attention Module with Cross-Spatial Learning}, 
  year={2023},
  pages={1-5},
  doi={10.1109/ICASSP49357.2023.10096516}}