PaddlePaddle / PaddleDetection

Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
Apache License 2.0

Array out-of-bounds error after the pretrained model finishes loading when training on the CDLA Chinese layout analysis dataset. #7969

Closed Fissionalist closed 1 year ago

Fissionalist commented 1 year ago

Bug Component

Training

Describe the Bug

The dataset comes from CDLA. I converted it to COCO format with the labelme2coco.py script provided by labelme. The converted dataset is laid out as follows:

|-CDLA_DATASET
  |- train_coco
     |- JPEGImages
        |- train_0001.jpg
        |- train_0002.jpg
        |- ...
     |- Visualization
        |- train_0001.jpg
        |- train_0002.jpg
        |- ...
     |- annotations.json
  |- val_coco
     |- JPEGImages
     |- Visualization
     |- annotations.json

The training config file is configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml. The modified parts are:


pretrain_weights: pretrained_model/picodet_lcnet_x1_0_fgd_layout_cdla.pdparams
weights: output/picodet_lcnet_x1_0_layout_cdla/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 10
snapshot_epoch: 1
epoch: 100

...

metric: COCO
num_classes: 10

TrainDataset:
  !COCODataSet
    image_dir: train_coco
    anno_path: train_coco/annotations.json
    dataset_dir: ./dataset/CDLA_DATASET/
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']

EvalDataset:
  !COCODataSet
    image_dir: val_coco
    anno_path: val_coco/annotations.json
    dataset_dir: ./dataset/CDLA_DATASET/

TestDataset:
  !ImageFolder
    anno_path: ./dataset/CDLA_DATASET/val_coco/annotations.json
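
As a sanity check on the paths in this config, here is a minimal sketch that joins `dataset_dir` with `anno_path` and `image_dir` and reports image files that cannot be found. It assumes each image's `file_name` in the annotation file is relative to `image_dir`. The helper name `find_missing_images` is made up for illustration and is not part of PaddleDetection.

```python
import json
import os
from typing import List


def find_missing_images(dataset_dir: str, image_dir: str, anno_path: str) -> List[str]:
    """Return resolved image paths listed in a COCO annotation file that do not exist.

    Assumption: paths resolve as dataset_dir/anno_path and
    dataset_dir/image_dir/file_name.
    """
    with open(os.path.join(dataset_dir, anno_path), encoding="utf-8") as f:
        coco = json.load(f)

    image_root = os.path.join(dataset_dir, image_dir)
    missing = []
    for image in coco.get("images", []):
        path = os.path.join(image_root, image["file_name"])
        if not os.path.isfile(path):
            missing.append(path)
    return missing
```

With the layout shown earlier, calling `find_missing_images("./dataset/CDLA_DATASET/", "train_coco", "train_coco/annotations.json")` would be expected to return an empty list if every referenced image is in place.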

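The pretrained model comes from the PP-Structure model list. After training starts, an array out-of-bounds error occurs as soon as the model parameters finish loading. The error message is: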

  File "tools/train.py", line 202, in <module>
    main()
  File "tools/train.py", line 198, in main
    run(FLAGS, cfg)
  File "tools/train.py", line 151, in run
    trainer.train(FLAGS.eval)
  File "/home/ubuntu/PaddleDetection/ppdet/engine/trainer.py", line 539, in train
    outputs = model(data)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/paddle/fluid/dygraph/layers.py", line 1012, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/ubuntu/PaddleDetection/ppdet/modeling/architectures/meta_arch.py", line 60, in forward
    out = self.get_loss()
  File "/home/ubuntu/PaddleDetection/ppdet/modeling/architectures/picodet.py", line 79, in get_loss
    loss_gfl = self.head.get_loss(head_outs, self.inputs)
  File "/home/ubuntu/PaddleDetection/ppdet/modeling/heads/simota_head.py", line 381, in get_loss
    gt_box, gt_label)
  File "/home/ubuntu/PaddleDetection/ppdet/modeling/heads/simota_head.py", line 113, in _get_target_single
    flatten_bbox, gt_bboxes, gt_labels)
  File "/home/ubuntu/PaddleDetection/ppdet/modeling/assigners/simota_assigner.py", line 200, in __call__
    )] = pairwise_ious.reshape([-1])
IndexError: index 10 is out of bounds for axis 1 with size 10
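
The message `index 10 is out of bounds for axis 1 with size 10` is consistent with some ground-truth label being 10 while only class slots 0 through 9 exist. A minimal sketch for checking this before training follows. It assumes the loader maps the category ids in `annotations.json`, in sorted order, to contiguous class indices starting at 0. That convention and the helper name `check_num_classes` are assumptions for illustration, not confirmed PaddleDetection behaviour.

```python
from typing import Dict, List


def check_num_classes(coco: Dict, num_classes: int) -> List[str]:
    """Report categories whose class index would not fit into num_classes slots."""
    problems = []

    # Assumed convention: sorted category ids map to class indices 0..K-1.
    cat_ids = sorted(cat["id"] for cat in coco.get("categories", []))
    cat_to_cls = {cat_id: idx for idx, cat_id in enumerate(cat_ids)}

    if len(cat_ids) > num_classes:
        problems.append(
            "%d categories declared but num_classes is %d" % (len(cat_ids), num_classes)
        )

    reported = set()  # report each offending category id only once
    for ann in coco.get("annotations", []):
        cat_id = ann["category_id"]
        if cat_id in reported:
            continue
        cls = cat_to_cls.get(cat_id)
        if cls is None:
            problems.append("category_id %r is not declared in categories" % (cat_id,))
            reported.add(cat_id)
        elif cls >= num_classes:
            problems.append(
                "category_id %r maps to class index %d, valid range is 0..%d"
                % (cat_id, cls, num_classes - 1)
            )
            reported.add(cat_id)
    return problems
```

To use it, load the annotation file with `json.load` and call `check_num_classes(coco, 10)`. An empty list means every label fits under that assumed mapping.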

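Suggestions from GPT3.5-turbo (these did not solve the problem):

  • Check that the bounding boxes and the number of classes in the dataset are correct, and make sure every image has at least one box and class.
  • Check that the parameters in the model config file are set correctly, including the number of classes and the training and test batch sizes.
  • Check that the code version is compatible with the dataset and model versions in use.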

Fissionalist commented 1 year ago

Training on the English publaynet dataset (class_num=5) with the default configuration works fine. Switching to the Chinese dataset (class_num=10) triggers the error.

JayTing511 commented 1 year ago

I ran into the same error and would like to know how to fix it.

Fissionalist commented 1 year ago

> I ran into the same error and would like to know how to fix it.

In my previous comment I linked another issue that explains the cause.

JayTing511 commented 1 year ago

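Thanks, I found the cause. But does the first class, background, really have to be counted? I tried removing it, but then the conversion to COCO format raised an error.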

Fissionalist commented 1 year ago

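> Thanks, I found the cause. But does the first class, background, really have to be counted? I tried removing it, but then the conversion to COCO format raised an error.

I think so. Strictly speaking, the unannotated background is also a class and should be included in training.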
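
On the background question, readers who would rather not count it could, instead of changing the conversion step, post-process the generated `annotations.json` and drop any category that no annotation refers to. The sketch below does that on an in-memory COCO dict. It assumes the background entry is never referenced by an annotation, and the helper name `drop_unused_categories` is hypothetical. The reply above takes the other route and counts background as a class.

```python
import copy
from typing import Dict


def drop_unused_categories(coco: Dict) -> Dict:
    """Return a copy of a COCO dict without categories that no annotation uses."""
    # Category ids that at least one annotation refers to.
    used = {ann["category_id"] for ann in coco.get("annotations", [])}

    # Work on a deep copy so the caller's data stays untouched.
    result = copy.deepcopy(coco)
    result["categories"] = [
        cat for cat in result.get("categories", []) if cat["id"] in used
    ]
    return result
```

This would also drop a real class that happens to have no boxes in a given split, so the train and val files should be checked to end up with the same category list.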