PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0
43.55k stars, 7.77k forks

Layout analysis results differ when training on my own dataset #11711

Closed: beetter closed this issue 4 months ago

beetter commented 7 months ago

Please provide the following information to quickly locate the problem

TingquanGao commented 7 months ago

May I ask why you need to fine-tune? Is the pretrained model we provide performing poorly? And did you use the default hyperparameters for fine-tuning?

beetter commented 7 months ago

May I ask why you need to fine-tune? Is the pretrained model we provide performing poorly? And did you use the default hyperparameters for fine-tuning?

I tested it and the results were not good: my documents have no fixed layout order, and some regions are missed or recognized incorrectly. The training hyperparameters are the defaults from configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml. I annotated polygons with labelme and used x2coco.py to convert them into a training set and a test set. The yml file is configured as follows:

_BASE_: [
  '../../../../runtime.yml',
  '../../_base_/picodet_esnet.yml',
  '../../_base_/optimizer_100e.yml',
  '../../_base_/picodet_640_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x1_0_pretrained.pdparams
weights: output/picodet_lcnet_x1_0_layout/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 10
snapshot_epoch: 1
epoch: 100

PicoDet:
  backbone: LCNet
  neck: CSPPAN
  head: PicoHead

LCNet:
  scale: 1.0
  feature_maps: [3, 4, 5]

metric: COCO
num_classes: 15

TrainDataset:
  !COCODataSet
    image_dir: train
    anno_path: annotations/instance_train.json
    dataset_dir: ./dataset/publaynet/cocome311/
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']

EvalDataset:
  !COCODataSet
    image_dir: val
    anno_path: annotations/instance_val.json
    dataset_dir: ./dataset/publaynet/cocome311/

TestDataset:
  !ImageFolder
    anno_path: ./dataset/publaynet/cocome311/annotations/instance_val.json

worker_num: 8
eval_height: &eval_height 800
eval_width: &eval_width 608
eval_size: &eval_size [*eval_height, *eval_width]

TrainReader:
  sample_transforms:

EvalReader:
  sample_transforms:

TestReader:
  inputs_def:
    image_shape: [1, 3, 800, 608]
  sample_transforms:
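
One thing worth ruling out with a setup like this is a mismatch between num_classes in the config and the categories that actually ended up in the converted annotation file. The following is a minimal sketch, not part of PaddleDetection: find_coco_problems and check_file are hypothetical helper names, and the code assumes the standard COCO keys (images, categories, annotations) with boxes stored as [x, y, width, height].

```python
import json
from collections import Counter


def find_coco_problems(anno, num_classes):
    """Return human-readable problems found in a COCO-style annotation dict."""
    problems = []
    cat_ids = {c["id"] for c in anno.get("categories", [])}
    if len(cat_ids) != num_classes:
        problems.append(
            f"categories: found {len(cat_ids)}, config expects {num_classes}"
        )
    image_ids = {img["id"] for img in anno.get("images", [])}
    per_class = Counter()
    for ann in anno.get("annotations", []):
        _, _, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
        if w <= 0 or h <= 0:
            problems.append(f"annotation {ann['id']}: non-positive box size")
        if ann["category_id"] not in cat_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id")
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']}: unknown image_id")
        per_class[ann["category_id"]] += 1
    # Declared classes that never appear cannot be learned from this file.
    for cid in sorted(cat_ids - set(per_class)):
        problems.append(f"category {cid}: no annotations")
    return problems


def check_file(path, num_classes):
    """Load a COCO JSON file from disk and run the checks above."""
    with open(path, encoding="utf-8") as f:
        return find_coco_problems(json.load(f), num_classes)
```

Calling check_file("./dataset/publaynet/cocome311/annotations/instance_train.json", 15) on the training annotations from the config would return an empty list if none of these issues is present. An empty list only rules out these particular problems, not every cause of poor results.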

beetter commented 7 months ago

Also, when I export the inference model with the following command, the export succeeds and 4 files are generated in output_inference:

python tools/export_model.py \
    -c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
    -o weights='output/picodet_lcnet_x1_0_layout/best_model.pdparams' \
    --output_dir=output_inference

When I combined the inference model with PaddleOCR, the following error appeared:

[Screenshot: WechatIMG792]

sralvins commented 7 months ago

Also, when I export the inference model with the following command, the export succeeds and 4 files are generated in output_inference: python tools/export_model.py -c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml -o weights='output/picodet_lcnet_x1_0_layout/best_model.pdparams' --output_dir=output_inference When I combined the inference model with PaddleOCR, the following error appeared: [Screenshot: WechatIMG792]

I got the same error.
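
The screenshot is not reproduced in this thread, so the actual cause of the error is unknown. One mismatch that may be worth ruling out when a model trained with custom classes (num_classes: 15 in the config above) is plugged into PaddleOCR is the class-name dictionary given to the layout predictor, assumed here to be the file passed via --layout_dict_path. The sketch below assumes a plain-text dictionary with one class name per line; read_label_names and label_count_matches are hypothetical helper names, not PaddleOCR functions.

```python
from pathlib import Path


def read_label_names(dict_path):
    """Read one class name per line, skipping blank lines."""
    text = Path(dict_path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def label_count_matches(dict_path, num_classes):
    """True when the dictionary lists exactly num_classes names."""
    return len(read_label_names(dict_path)) == num_classes
```

If label_count_matches returns False for the dictionary in use and num_classes=15, the dictionary and the exported model disagree about the number of classes. Whether that explains the error reported here cannot be confirmed from the thread.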

TingquanGao commented 7 months ago

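Does the exported inference model raise an error when you run inference with it directly? Also, could you share the model files with us so we can investigate?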

sralvins commented 7 months ago

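Does the exported inference model raise an error when you run inference with it directly? Also, could you share the model files with us so we can investigate?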

picodet_lcnet_x1_0_layout.zip

liyuweihuo commented 7 months ago

Is there a final answer on this? Fine-tuning on my own data does not give good results either.

xuwinrar commented 5 months ago

Same question here.