[Other General Issues] How do I replace the neck of a model?

RhythmF commented 2 years ago

For example, I want to replace the YOLO FPN in PP-YOLO with BiFPN (which already exists in the modeling directory). Which yml files should I modify? Thanks!

heavengate commented 2 years ago

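First, change the neck entry in the config file so that it uses BiFPN, for example at https://github.com/PaddlePaddle/PaddleDetection/blob/5e9fc1ffc45821d1344048006f06ab96276d91e6/configs/ppyolo/_base_/ppyolo_r50vd_dcn.yml#L9 Then add a BiFPN config section, or convert the existing PPYOLOFPN section into a BiFPN section: https://github.com/PaddlePaddle/PaddleDetection/blob/5e9fc1ffc45821d1344048006f06ab96276d91e6/configs/ppyolo/_base_/ppyolo_r50vd_dcn.yml#L22 While making these changes, make sure the configured input and output feature-map shapes match the backbone before the neck and the head after it.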

RhythmF commented 2 years ago

After modifying the yml config as instructed, training fails with AssertionError: the module BIFPN is not registered. Is there something else I need to change?

RhythmF commented 2 years ago

@heavengate Could you please help with this? Thanks!

nemonameless commented 2 years ago

Please share your code branch so that we can look at the code and debug it.

RhythmF commented 2 years ago

@nemonameless The runtime environment is BML on Baidu AI Studio, and the training command is !python tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_voc.yml.yml --eval --use_vdl=True --vdl_log_dir="./output"

I only modified the yml file PaddleDetection/configs/ppyolo/_base_/ppyolov2_r50vd_dcn.yml. On line 9, I changed neck: PPYOLOPAN to neck: BIFPN, and at line 22 I changed

PPYOLOPAN:
  drop_block: true
  block_size: 3
  keep_prob: 0.9
  spp: true

to

BIFPN:
  norm_type: bn
  num_stacks: 1
  act: swish
  num_extra_levels: 2

Error:

Traceback (most recent call last):
  File "tools/train.py", line 171, in <module>
    main()
  File "tools/train.py", line 167, in main
    run(FLAGS, cfg)
  File "tools/train.py", line 118, in run
    trainer = Trainer(cfg, mode='train')
  File "/home/aistudio/PaddleDetection/ppdet/engine/trainer.py", line 90, in __init__
    self.model = create(cfg.architecture)
  File "/home/aistudio/PaddleDetection/ppdet/core/workspace.py", line 238, in create
    cls_kwargs.update(cls.from_config(config, **kwargs))
  File "/home/aistudio/PaddleDetection/ppdet/modeling/architectures/yolo.py", line 66, in from_config
    neck = create(cfg['neck'], **kwargs)
  File "/home/aistudio/PaddleDetection/ppdet/core/workspace.py", line 215, in create
    "the module {} is not registered".format(name)
AssertionError: the module BIFPN is not registered

nemonameless commented 2 years ago

This part needs to be changed as well: https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.3/configs/ppyolo/_base_/ppyolov2_r50vd_dcn.yml#L9

YOLOv3:
  backbone: ResNet
  neck: BiFPN
  yolo_head: YOLOv3Head
  post_process: BBoxPostProcess

Also, note that the i in BiFPN is lowercase.
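
The "not registered" assertion is consistent with the configured name being looked up verbatim among registered module names. A minimal sketch of that kind of lookup follows, assuming a plain dict registry. REGISTRY, register, and create are hypothetical stand-ins written for illustration, not ppdet's actual implementation; only the error message text is taken from the traceback above.

```python
# Toy name registry illustrating a case-sensitive lookup.
# REGISTRY, register and create are hypothetical stand-ins, not ppdet's code.
REGISTRY = {}


def register(cls):
    """Record a class under its exact class name."""
    REGISTRY[cls.__name__] = cls
    return cls


def create(name, **kwargs):
    """Instantiate a registered class; the name must match exactly."""
    assert name in REGISTRY, "the module {} is not registered".format(name)
    return REGISTRY[name](**kwargs)


@register
class BiFPN:
    def __init__(self, num_stacks=1):
        self.num_stacks = num_stacks


neck = create("BiFPN", num_stacks=1)  # the exact spelling is found

try:
    create("BIFPN")  # an upper-case I makes this a different key
    lookup_error = None
except AssertionError as err:
    lookup_error = str(err)
```

In this sketch both the neck: entry and the section key would have to use the exact class name, which matches the advice above.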

RhythmF commented 2 years ago

@nemonameless

I made the change, and now I get this error:

Traceback (most recent call last):
  File "tools/train.py", line 171, in <module>
    main()
  File "tools/train.py", line 167, in main
    run(FLAGS, cfg)
  File "tools/train.py", line 127, in run
    trainer.train(FLAGS.eval)
  File "/home/aistudio/PaddleDetection/ppdet/engine/trainer.py", line 398, in train
    outputs = model(data)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 914, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "/home/aistudio/PaddleDetection/ppdet/modeling/architectures/meta_arch.py", line 54, in forward
    out = self.get_loss()
  File "/home/aistudio/PaddleDetection/ppdet/modeling/architectures/yolo.py", line 121, in get_loss
    return self._forward()
  File "/home/aistudio/PaddleDetection/ppdet/modeling/architectures/yolo.py", line 80, in _forward
    neck_feats = self.neck(body_feats, self.for_mot)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 914, in __call__
    outputs = self.forward(*inputs, **kwargs)
TypeError: forward() takes 2 positional arguments but 3 were given

nemonameless commented 2 years ago

The number of arguments does not match. Use https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.3/ppdet/modeling/necks/yolo_fpn.py#L444 as a reference and modify https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.3/ppdet/modeling/necks/bifpn.py#L289 accordingly; you just need to add a for_mot=False parameter.
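
The mismatch can be reproduced outside the framework. The sketch below uses two toy classes, NeckWithoutFlag and NeckWithFlag, which are hypothetical and not ppdet's BiFPN; it only assumes, as the traceback shows, that the architecture calls the neck with two positional arguments (the features and for_mot).

```python
# Toy classes reproducing the argument-count mismatch.
# NeckWithoutFlag / NeckWithFlag are hypothetical, not ppdet's BiFPN.
class NeckWithoutFlag:
    def forward(self, feats):
        return feats


class NeckWithFlag:
    def forward(self, feats, for_mot=False):
        # for_mot is accepted for call compatibility and ignored in this sketch.
        return feats


body_feats = ["c3", "c4", "c5"]

try:
    # Same call shape as self.neck(body_feats, self.for_mot).
    NeckWithoutFlag().forward(body_feats, False)
    error = None
except TypeError as err:
    error = str(err)

# With the extra parameter, the same call succeeds.
neck_feats = NeckWithFlag().forward(body_feats, False)
```

Giving the parameter a default value keeps existing callers that pass only the features working.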

RhythmF commented 2 years ago

@nemonameless I added for_mot=False.

Error:

Traceback (most recent call last):
  File "tools/train.py", line 171, in <module>
    main()
  File "tools/train.py", line 167, in main
    run(FLAGS, cfg)
  File "tools/train.py", line 127, in run
    trainer.train(FLAGS.eval)
  File "/home/aistudio/PaddleDetection/ppdet/engine/trainer.py", line 398, in train
    outputs = model(data)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 914, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "/home/aistudio/PaddleDetection/ppdet/modeling/architectures/meta_arch.py", line 54, in forward
    out = self.get_loss()
  File "/home/aistudio/PaddleDetection/ppdet/modeling/architectures/yolo.py", line 121, in get_loss
    return self._forward()
  File "/home/aistudio/PaddleDetection/ppdet/modeling/architectures/yolo.py", line 88, in _forward
    yolo_losses = self.yolo_head(neck_feats, self.inputs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 914, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "/home/aistudio/PaddleDetection/ppdet/modeling/heads/yolo_head.py", line 87, in forward
    assert len(feats) == len(self.anchors)
AssertionError

RhythmF commented 2 years ago

@heavengate Thank you for your patient answers so far. I hope I can get further help with the problem above. Thanks!

nemonameless commented 2 years ago

Please share your code branch so that we can help look at the code and debug it.

RhythmF commented 2 years ago

@nemonameless Sorry, I'm not very familiar with GitHub. Is this what you need? https://github.com/RhythmF/PaddleDetection

In total, I only changed these three places:

https://github.com/RhythmF/PaddleDetection/blob/release/2.3/configs/ppyolo/_base_/ppyolov2_r50vd_dcn.yml#L9
https://github.com/RhythmF/PaddleDetection/blob/release/2.3/configs/ppyolo/_base_/ppyolov2_r50vd_dcn.yml#L22-L26
https://github.com/RhythmF/PaddleDetection/blob/release/2.3/ppdet/modeling/necks/bifpn.py#L289

nemonameless commented 2 years ago

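PP-YOLO has only 3 heads, and a ResNet backbone usually outputs 3 feature levels that are fed into yolo_fpn. BiFPN, however, was originally built for an EfficientNet backbone: on top of the 3 levels coming out of the backbone it adds two extra output levels, so 5 levels in total are fed into BiFPN. In addition, the regular yolo_fpn in the code outputs the lowest-resolution features first, e.g. 20x20, 40x40, 80x80, whereas BiFPN in the code outputs the highest-resolution features first, so you just need to reverse the order. Note that this only makes training run; it does not guarantee that the model will reach high accuracy.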

The changes are:

BiFPN:
  norm_type: bn
  num_stacks: 1
  act: swish
  num_extra_levels: 0  # was 2

and https://github.com/PaddlePaddle/PaddleDetection/blob/develop/ppdet/modeling/necks/bifpn.py#L300

        # return fpn_feats
        return fpn_feats[::-1]
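
The two constraints behind these changes (one feature map per head, lowest resolution first) can be checked with a small sketch that tracks spatial sizes only. The function toy_bifpn_levels and all sizes are illustrative assumptions rather than ppdet code; in particular, it assumes that each extra level halves the resolution.

```python
# Toy check of level count and ordering, using spatial sizes only.
# All names and sizes are illustrative assumptions, not ppdet code.
def toy_bifpn_levels(backbone_sizes, num_extra_levels):
    """Return output sizes, highest resolution first, as described above."""
    sizes = sorted(backbone_sizes, reverse=True)
    for _ in range(num_extra_levels):
        sizes.append(sizes[-1] // 2)  # assumed: each extra level halves the size
    return sizes


num_anchor_groups = 3  # the head expects one feature map per anchor group

five_levels = toy_bifpn_levels([80, 40, 20], num_extra_levels=2)
three_levels = toy_bifpn_levels([80, 40, 20], num_extra_levels=0)

# Reverse so that the lowest resolution comes first, like yolo_fpn.
head_inputs = three_levels[::-1]
```

With two extra levels the neck would hand 5 maps to a head that expects 3, which corresponds to the failed assert len(feats) == len(self.anchors); with zero extra levels and the reversed order, the count and ordering line up.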

RhythmF commented 2 years ago

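@nemonameless Thank you very much for the answer; it runs now. I have one more question: why does loss=NaN appear during training? After I lowered the learning rate, the NaN still appeared, only a few epochs later.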

PaddlePaddle / PaddleDetection

[Other General Issues] How do I replace the neck of a model? #4968