PaddlePaddle / PaddleDetection

Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
Apache License 2.0

PPYOLOE+ Eval fails when the FPN strides are set to [64,32,16,8] #7334

Open lxgyChen opened 1 year ago

lxgyChen commented 1 year ago

Search before asking

Bug Component

Validation

Describe the Bug

With PPYOLOE+, setting the FPN strides to [64, 32, 16, 8] trains without error, but Eval fails. With [32, 16, 8, 4], both training and evaluation work. ppyoloe_plus_crn_p6.yml:

CSPResNet:
  return_idx: [0, 1, 2, 3]

CustomCSPPAN:
  out_channels: [768, 384, 192, 64]

PPYOLOEHead:
  fpn_strides: [64, 32, 16, 8]

Error message:

Traceback (most recent call last):
  File "/home/user/code/PaddleDetection/tools/train.py", line 172, in <module>
    main()
  File "/home/user/code/PaddleDetection/tools/train.py", line 168, in main
    run(FLAGS, cfg)
  File "/home/user/code/PaddleDetection/tools/train.py", line 132, in run
    trainer.train(FLAGS.eval)
  File "/home/user/code/PaddleDetection/ppdet/engine/trainer.py", line 564, in train
    self._eval_with_loader(self._eval_loader)
  File "/home/user/code/PaddleDetection/ppdet/engine/trainer.py", line 595, in _eval_with_loader
    outs = self.model(data)
  File "/home/user/anaconda3/envs/paddle/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "/home/user/anaconda3/envs/paddle/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "/home/user/code/PaddleDetection/ppdet/modeling/architectures/meta_arch.py", line 75, in forward
    outs.append(self.get_pred())
  File "/home/user/code/PaddleDetection/ppdet/modeling/architectures/yolo.py", line 127, in get_pred
    return self._forward()
  File "/home/user/code/PaddleDetection/ppdet/modeling/architectures/yolo.py", line 117, in _forward
    bbox, bbox_num = self.yolo_head.post_process(
  File "/home/user/code/PaddleDetection/ppdet/modeling/heads/ppyoloe_head.py", line 374, in post_process
    pred_bboxes = batch_distance2bbox(anchor_points, pred_dist)
  File "/home/user/code/PaddleDetection/ppdet/modeling/bbox_utils.py", line 478, in batch_distance2bbox
    x1y1 = -lt + points
  File "/home/user/anaconda3/envs/paddle/lib/python3.9/site-packages/paddle/fluid/dygraph/math_op_patch.py", line 299, in __impl__
    return math_op(self, other_var, 'axis', axis)
ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 34000, 2] and the shape of Y = [8500, 2]. Received [34000] in X is not equal to [8500] in Y at i:1.
  [Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at /paddle/paddle/phi/kernels/funcs/common_shape.h:84)
  [operator < elementwise_add > error]
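
The hint in the error spells out the broadcast rule: once the two shapes are right-aligned, each pair of dimensions must be equal, or one of them must be 1. The following is a minimal pure-Python sketch of that check. The helper name `broadcast_shape` is made up for illustration and is not part of Paddle's API; it only mirrors the rule quoted in the hint.

```python
def broadcast_shape(x_shape, y_shape):
    """Return the broadcast result shape, or raise ValueError on mismatch.

    Hypothetical helper for illustration only; it follows the rule quoted
    in the error hint, not Paddle's actual implementation.
    """
    ndim = max(len(x_shape), len(y_shape))
    # Right-align the shapes by padding the shorter one with leading 1s.
    x = [1] * (ndim - len(x_shape)) + list(x_shape)
    y = [1] * (ndim - len(y_shape)) + list(y_shape)
    out = []
    for i, (xd, yd) in enumerate(zip(x, y)):
        if xd == yd or xd == 1 or yd == 1:
            out.append(max(xd, yd))
        else:
            raise ValueError(f"dimension mismatch at i:{i}: {xd} vs {yd}")
    return out


# Matching point counts broadcast fine.
print(broadcast_shape([1, 34000, 2], [34000, 2]))  # [1, 34000, 2]

# The shapes from the traceback do not.
try:
    broadcast_shape([1, 34000, 2], [8500, 2])
except ValueError as err:
    print(err)  # dimension mismatch at i:1: 34000 vs 8500
```

The mismatch is reported at index 1, the same position as in the Paddle message, which suggests the predicted distances and the anchor points were built from different numbers of grid cells.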

Environment

Bug description confirmation

Are you willing to submit a PR?

jerrywgz commented 1 year ago

Could you print the shapes of reg_dist_list and anchor_points in forward_eval of the ppyoloe head and check whether they match?

lxgyChen commented 1 year ago

They do not match. With a 640*640 model input and strides [64, 32, 16, 8], the printed output is:

[INFO] [ppyoloe_head] [forward_eval] anchor_points.shape = [8500, 2] stride_tensor.shape = [8500, 1]
feat.shape=
[1, 576, 20, 20]
[1, 288, 40, 40]
[1, 144, 80, 80]
[1, 48, 160, 160]
[INFO] [ppyoloe_head] [forward_eval] cls_score_list.shape = [1, 80, 34000]
[INFO] [ppyoloe_head] [forward_eval] reg_dist_list.shape = [1, 34000, 4]

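I checked, and in the working case with strides [32, 16, 8, 4], reg_dist_list and anchor_points both have 34000 entries. Is 32 the largest supported stride? With a stride of 64, the smallest feat.shape should be 10*10.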
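
The two numbers in the log can be reproduced with plain arithmetic. This is a sketch under the assumption that the eval path places one anchor point per cell of an (input_size / stride) grid at each level; `anchor_counts` is a hypothetical helper, not a PaddleDetection function.

```python
def anchor_counts(input_size, strides):
    """Anchor points per level, assuming one point per cell of a square
    grid with input_size // stride cells per side (an assumption)."""
    return [(input_size // s) ** 2 for s in strides]


# Counts implied by the configured fpn_strides at a 640x640 input.
from_strides = anchor_counts(640, [64, 32, 16, 8])

# Counts implied by the feature maps actually printed in the log.
feat_hw = [(20, 20), (40, 40), (80, 80), (160, 160)]
from_feats = [h * w for h, w in feat_hw]

print(from_strides, sum(from_strides))  # [100, 400, 1600, 6400] 8500
print(from_feats, sum(from_feats))      # [400, 1600, 6400, 25600] 34000
```

Under that assumption, 8500 comes from the configured strides while 34000 comes from the real feature maps, which matches the shapes of anchor_points and reg_dist_list in the log.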

nemonameless commented 1 year ago

CSPResNet:
  return_idx: [0, 1, 2, 3]

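This only adds one more backbone output, the P2 level; from shallow to deep the outputs correspond to strides [4, 8, 16, 32]. A config changed this way can only be called P2: the deepest stage is res5, whose stride is 32, so 32 is the maximum. Training and evaluation are only correct with fpn_strides: [32, 16, 8, 4]. A wrong fpn_strides does not raise an error during training; the model is simply trained incorrectly.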
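
Since a wrong fpn_strides value passes silently during training, a small sanity check can catch it early. The sketch below infers each level's stride from the input size and the feature map size, then compares the result with the configured list. Both function names are hypothetical and nothing here is an existing PaddleDetection API.

```python
def infer_strides(input_size, feat_sizes):
    """Stride of each level, computed as input_size / feature map side."""
    strides = []
    for size in feat_sizes:
        if input_size % size != 0:
            raise ValueError(f"{input_size} is not divisible by {size}")
        strides.append(input_size // size)
    return strides


def check_fpn_strides(input_size, feat_sizes, fpn_strides):
    """Raise ValueError if fpn_strides disagrees with the feature maps."""
    actual = infer_strides(input_size, feat_sizes)
    if list(fpn_strides) != actual:
        raise ValueError(
            f"fpn_strides {list(fpn_strides)} do not match the feature maps; "
            f"expected {actual}"
        )
    return actual


# Feature map sides taken from the log earlier in this thread.
feat_sizes = [20, 40, 80, 160]
print(check_fpn_strides(640, feat_sizes, [32, 16, 8, 4]))  # [32, 16, 8, 4]

try:
    check_fpn_strides(640, feat_sizes, [64, 32, 16, 8])
except ValueError as err:
    print(err)
```

Running a check like this once before training, on the shapes the neck actually produces, would flag the [64, 32, 16, 8] setting before any epochs are spent on it.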

P6 requires adding one more res block after the backbone, which means modifying the backbone code. You can refer to the P6 in csp_darknet: https://github.com/PaddlePaddle/PaddleYOLO/blob/release/2.5/ppdet/modeling/backbones/csp_darknet.py#L318
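
The stride arithmetic behind this can be sketched as follows. The assumption, chosen only to match the strides [4, 8, 16, 32] quoted above, is that the stem downsamples by 2 and every later stage downsamples by 2 again; `stage_strides` is a hypothetical helper, not backbone code.

```python
def stage_strides(num_stages, stem_stride=2):
    """Cumulative stride after each stage, assuming the stem downsamples by
    stem_stride and each stage halves the resolution (an assumption)."""
    strides = []
    stride = stem_stride
    for _ in range(num_stages):
        stride *= 2
        strides.append(stride)
    return strides


print(stage_strides(4))  # [4, 8, 16, 32]      four stages: P2 to P5
print(stage_strides(5))  # [4, 8, 16, 32, 64]  one extra stage adds P6
```

With four stages, changing return_idx only selects among strides 4 to 32; a stride-64 output appears only once a fifth downsampling stage exists.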

lxgyChen commented 1 year ago

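@nemonameless Have you not tried a P6 model? visdrone has a P2 config, which can use the Objects365 pretrained model, but after making the P6 change there is no corresponding pretrained model available.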