QY1994-0919 / CFPNet

Centralized Feature Pyramid for Object Detection
Apache License 2.0

Adding CFP to YOLOv5-s with MMYOLO improves MS COCO results by 1 point #10

Open hhaAndroid opened 1 year ago

hhaAndroid commented 1 year ago

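Hello author @QY1994-0919,

I quickly tried your code in YOLOv5. On the MS COCO val set it reaches 38.6 AP, up from 37.6 without CFP, a gain of exactly 1 point.

I ran the experiment with MMYOLO. The config and code are on my personal branch: https://github.com/hhaAndroid/mmyolo/tree/bifpn_demo (please ignore the branch name, it was picked at random).

The steps were:

(1) Create https://github.com/hhaAndroid/mmyolo/blob/bifpn_demo/mmyolo/models/necks/yolov5_cpafpn.py

The core code is very simple: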

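import torch.nn as nn
from mmyolo.models.necks import YOLOv5PAFPN
from mmyolo.models.utils import make_divisible
from mmyolo.registry import MODELS
# EVCBlock comes from the CFPNet code; import it from wherever you copied it.


@MODELS.register_module()
class YOLOv5CPAFPN(YOLOv5PAFPN):
    """YOLOv5 PAFPN with an EVCBlock appended to the deepest upsample layer."""

    def build_upsample_layer(self, idx: int) -> nn.Module:
        """Build the upsample layer for level ``idx``."""
        upsample = nn.Upsample(scale_factor=2, mode='nearest')
        if idx != len(self.in_channels) - 1:
            return upsample
        # Only the deepest level gets the EVCBlock, applied after upsampling.
        channels = make_divisible(self.out_channels[idx - 1], self.widen_factor)
        evc_block = EVCBlock(channels, channels, channel_ratio=4, base_channel=16)
        return nn.Sequential(upsample, evc_block)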
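
To see which channel count the EVCBlock is built with, the arithmetic inside build_upsample_layer can be traced without PyTorch. This is only a sketch: make_divisible below is a stand-in that assumes MMYOLO's helper scales by widen_factor and rounds up to a multiple of 8, and the YOLOv5-s values (in_channels = out_channels = [256, 512, 1024], widen_factor = 0.5) are assumptions that the post does not state. evc_channels is a hypothetical helper written for illustration.

```python
import math
from typing import List, Optional


def make_divisible(x: float, widen_factor: float = 1.0, divisor: int = 8) -> int:
    """Stand-in for MMYOLO's helper (assumed behaviour):
    scale by widen_factor, then round up to a multiple of divisor."""
    return math.ceil(x * widen_factor / divisor) * divisor


def evc_channels(idx: int, in_channels: List[int], out_channels: List[int],
                 widen_factor: float) -> Optional[int]:
    """Channel count the EVCBlock is built with at level idx,
    or None when that level only gets a plain nn.Upsample."""
    if idx != len(in_channels) - 1:
        return None
    return make_divisible(out_channels[idx - 1], widen_factor)


# Assumed YOLOv5-s neck settings; not taken from the post.
channels = [256, 512, 1024]
widen = 0.5

print(evc_channels(2, channels, channels, widen))  # 256
print(evc_channels(1, channels, channels, widen))  # None
```

Under these assumptions only the deepest level (idx = 2) receives an EVCBlock, and it operates on 256 channels.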

(2) Create a new config: https://github.com/hhaAndroid/mmyolo/blob/bifpn_demo/configs/yolov5/yolov5_s-v61_cpafpn_syncbn_fast_8xb16-300e_coco.py

_base_ = 'yolov5_s-v61_syncbn_fast_8xb16-300e_coco.py'

model = dict(neck=dict(type='YOLOv5CPAFPN'))

find_unused_parameters = True

(3) Launch distributed training

cd mmyolo

bash ./tools/dist_train.sh configs/yolov5/yolov5_s-v61_cpafpn_syncbn_fast_8xb16-300e_coco.py 8

Results

Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.386                                                                                             
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.581                                                                                             
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.417                                                                                              
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.218                                                                                              
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.433                                                                                              
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.500                                                                                              
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.316                                                                                              
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.522                                                                                              
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.575                                                                                              
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.382                                                                                              
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.631                                                                                              
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.721  
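
The headline figure is the first line of this summary (AP at IoU=0.50:0.95, all areas). A small sketch for pulling the numbers out of such a log and comparing against a baseline follows. The function name parse_coco_summary is hypothetical, only three of the lines above are embedded for brevity, and the 0.376 baseline is the "before" figure quoted at the top of this post.

```python
import re
from typing import Dict, Tuple

# One COCO-style summary line, e.g.
# "Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.386"
LINE_RE = re.compile(
    r"Average (?:Precision|Recall)\s+\((AP|AR)\) @\[ IoU=(\S+)\s*\| "
    r"area=\s*(\w+) \| maxDets=\s*(\d+) \] = ([\d.]+)"
)

Key = Tuple[str, str, str, int]  # (metric, IoU range, area, maxDets)


def parse_coco_summary(text: str) -> Dict[Key, float]:
    """Map (metric, iou, area, max_dets) to the reported value."""
    results: Dict[Key, float] = {}
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            metric, iou, area, max_dets, value = match.groups()
            results[(metric, iou, area, int(max_dets))] = float(value)
    return results


log = """
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.386
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.581
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.316
"""

metrics = parse_coco_summary(log)
baseline_ap = 0.376  # "before" figure quoted in the post
gain_points = round((metrics[("AP", "0.50:0.95", "all", 100)] - baseline_ap) * 100, 1)
print(gain_points)  # 1.0
```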

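(4) Feature map visualization. For detailed usage, see: https://mmyolo.readthedocs.io/zh_CN/latest/user_guides/visualization.html

Using the trained model, visualize the feature maps at the input and output of the EVCBlock module to see what the module does.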

cd mmyolo
python demo/featmap_vis_demo.py image/zidane.jpg configs/yolov5/yolov5_s-v61_cpafpn_syncbn_fast_8xb16-300e_coco.py epoch_300.pth --target-layers neck.upsample_layers[0][0] neck.upsample_layers[0][1] --channel-reduction squeeze_mean

image

(5) Grad-CAM and gradient-free CAM

Grad-CAM visualization at the EVCBlock input

 python demo/boxam_vis_demo.py image/zidane.jpg configs/yolov5/yolov5_s-v61_cpafpn_syncbn_fast_8xb16-300e_coco.py epoch_300.pth --target-layers neck.upsample_layers[0][0] 

image

Grad-CAM visualization at the EVCBlock output

 python demo/boxam_vis_demo.py image/zidane.jpg configs/yolov5/yolov5_s-v61_cpafpn_syncbn_fast_8xb16-300e_coco.py epoch_300.pth --target-layers neck.upsample_layers[0][1] 

image

AblationCAM visualization at the EVCBlock input

 python demo/boxam_vis_demo.py image/zidane.jpg configs/yolov5/yolov5_s-v61_cpafpn_syncbn_fast_8xb16-300e_coco.py epoch_300.pth --target-layers neck.upsample_layers[0][0]  --method ablationcam

image

AblationCAM visualization at the EVCBlock output

 python demo/boxam_vis_demo.py image/zidane.jpg configs/yolov5/yolov5_s-v61_cpafpn_syncbn_fast_8xb16-300e_coco.py epoch_300.pth --target-layers neck.upsample_layers[0][1]  --method ablationcam

image

If you think this is worth merging into the MMYOLO main branch, feel free to leave a comment!

dongzhang89 commented 1 year ago

@hhaAndroid
Thanks for the report! Best of luck. -Dong

zhmcnmb commented 1 year ago

Hello, could you share the weights trained after adding CFP to the YOLOv5-s network?

nyj-ocean commented 1 year ago

@hhaAndroid Hello, following your method I added YOLOv8CPAFPN(YOLOv8PAFPN) to the MMYOLO framework and replaced the YOLOv8 neck with YOLOv8CPAFPN, but running it produces the following error:

RuntimeError: Given groups=1, weight of size [256, 256, 7, 7], expected input[4, 512, 40, 40] to have 256 channels, but got 512 channels instead

How can I fix this?
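
One way to read this error is as plain channel arithmetic. The sketch below is not a confirmed fix. It assumes YOLOv8-s neck settings (in_channels = out_channels = [256, 512, 1024], widen_factor = 0.5), assumes that YOLOv8PAFPN passes the deepest feature to the upsample layer without reducing its channels (whereas YOLOv5PAFPN reduces it to the next level's width first), and uses a stand-in make_divisible that rounds up to a multiple of 8. None of these are stated in the thread.

```python
import math


def make_divisible(x: float, widen_factor: float = 1.0, divisor: int = 8) -> int:
    """Stand-in for MMYOLO's helper (assumed behaviour)."""
    return math.ceil(x * widen_factor / divisor) * divisor


# Assumed YOLOv8-s neck settings; not taken from the thread.
in_channels = [256, 512, 1024]
out_channels = [256, 512, 1024]
widen_factor = 0.5
idx = len(in_channels) - 1  # deepest level, where the EVCBlock is attached

# Channels the EVCBlock is built for when the YOLOv5 code is copied as-is.
built_for = make_divisible(out_channels[idx - 1], widen_factor)

# Channels arriving at the block if the neck reduces the deepest feature first
# (assumed YOLOv5PAFPN behaviour) versus if it does not (assumed YOLOv8PAFPN).
arrives_with_reduce = make_divisible(in_channels[idx - 1], widen_factor)
arrives_without_reduce = make_divisible(in_channels[idx], widen_factor)

print(built_for, arrives_with_reduce, arrives_without_reduce)  # 256 256 512
```

Under these assumptions the numbers line up with the message: the block's first convolution expects 256 channels while a 512-channel tensor arrives. If that reading is right, sizing the EVCBlock from in_channels[idx] rather than out_channels[idx - 1] in the YOLOv8 subclass would make the two agree, though this has not been tested here.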

WangBingJian233 commented 2 months ago

Hello, how can I use Grad-CAM or AblationCAM to visualize the feature maps before and after the EVCBlock module? The "detailed usage" link in your post (https://mmyolo.readthedocs.io/zh_CN/latest/user_guides/visualization.html) no longer works. Looking forward to your reply.