PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0
44.6k stars 7.85k forks

Tested on ICDAR2015: training accuracy is high, but val accuracy is extremely low #5443

Closed morestart closed 1 year ago

morestart commented 2 years ago

Training accuracy is very high, but val accuracy is very low. The dataset is ICDAR2015.

Training command: python3 -m paddle.distributed.launch --log_dir=./debug/ --gpus '0' tools/train.py -c /home/cat/Desktop/PaddleOCR-release-2.4/configs/rec/rec_icdar15_train.yml

[2022/02/11 09:23:15] root INFO: epoch: [600/600], iter: 10199, lr: 0.000500, loss: 0.083509, acc: 0.984371, norm_edit_dis: 0.994565, reader_cost: 0.00006 s, batch_cost: 0.05497 s, samples: 2304, ips: 4191.43876
[2022/02/11 09:23:16] root INFO: save model in ./output/rec/ic15/latest
[2022/02/11 09:23:16] root INFO: save model in ./output/rec/ic15/iter_epoch_600
[2022/02/11 09:23:16] root INFO: best metric, acc: 0.013024595743079717, norm_edit_dis: 0.11359001644052125, fps: 8439.407288056133, best_epoch: 589
INFO 2022-02-11 09:23:19,664 launch.py:311] Local processes completed.
andyjiang1116 commented 2 years ago

Did you modify the config file in any way? This looks like severe overfitting.

morestart commented 2 years ago

@andyjpaddle

Global:
  use_gpu: true
  epoch_num: 600
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec/ic15/
  save_epoch_step: 3
  # evaluation is run every 2000 iterations
  eval_batch_step: [0, 2000]
  cal_metric_during_train: True
  pretrained_model:
  checkpoints:
  save_inference_dir: ./
  use_visualdl: False
  infer_img: doc/imgs_words_en/word_10.png
  # for data or label process
  character_dict_path: /home/cat/Desktop/PO/dict.txt
  max_text_length: 25
  infer_mode: False
  use_space_char: False
  save_res_path: ./output/rec/predicts_ic15.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    learning_rate: 0.0005
  regularizer:
    name: 'L2'
    factor: 0

Architecture:
  model_type: rec
  algorithm: CRNN
  Transform:
  Backbone:
    name: MobileNetV3
    scale: 0.5
    model_name: large
  Neck:
    name: SequenceEncoder
    encoder_type: rnn
    hidden_size: 96
  Head:
    name: CTCHead
    fc_decay: 0

Loss:
  name: CTCLoss

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

Train:
  dataset:
    name: SimpleDataSet
    data_dir: /home/cat/Desktop/PO/icdar2015/recognition/train
    label_file_list: ["/home/cat/Desktop/PO/train.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 100]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: True
    batch_size_per_card: 256
    drop_last: True
    num_workers: 8
    use_shared_memory: False

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /home/cat/Desktop/PO/icdar2015/recognition/test
    label_file_list: ["/home/cat/Desktop/PO/val.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 100]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 256
    num_workers: 4
    use_shared_memory: False
andyjiang1116 commented 2 years ago

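Are you using the ICDAR2015 dataset? Is the dictionary the matching official one? From what I can see, you mainly changed the epoch count, the dataset paths, and the dictionary.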

morestart commented 2 years ago

I used the ICDAR2015 dataset provided by WenmuZhou. I only removed the space and one special character.

andyjiang1116 commented 2 years ago

You could try training with the default of 72 epochs; 600 epochs is too many.

morestart commented 2 years ago

The results at 72 epochs are pretty bad:

[2022/02/11 10:45:10] root INFO: epoch: [72/600], iter: 1210, lr: 0.000500, loss: 16.296036, acc: 0.000000, norm_edit_dis: 0.170458, reader_cost: 0.03949 s, batch_cost: 0.06590 s, samples: 1024, ips: 1553.80111
[2022/02/11 10:45:11] root INFO: epoch: [72/600], iter: 1220, lr: 0.000500, loss: 16.415695, acc: 0.000000, norm_edit_dis: 0.167776, reader_cost: 0.00006 s, batch_cost: 0.04214 s, samples: 2560, ips: 6074.67093
[2022/02/11 10:45:11] root INFO: epoch: [72/600], iter: 1223, lr: 0.000500, loss: 16.225019, acc: 0.000000, norm_edit_dis: 0.169712, reader_cost: 0.00002 s, batch_cost: 0.01245 s, samples: 768, ips: 6169.11002
[2022/02/11 10:45:11] root INFO: save model in ./output/rec/ic15/latest
[2022/02/11 10:45:11] root INFO: save model in ./output/rec/ic15/iter_epoch_72
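
A side note on reading these logs: the acc values printed during training are training-batch metrics, and the config comment says evaluation runs every 2000 iterations. The sketch below works out, from the iteration counters logged in this thread, how many evaluation points each run could reach. The trigger rule (evaluate at positive multiples of 2000) is an assumption based on that config comment, not a confirmed description of the trainer, and the helper names are hypothetical.

```python
# Values copied from the logs and config posted in this thread.
last_iter_600 = 10199  # final iteration index logged at epoch 600/600
last_iter_72 = 1223    # final iteration index logged at epoch 72
eval_every = 2000      # from eval_batch_step: [0, 2000]


def iters_per_epoch(last_iter, epochs):
    """Iterations are 0-indexed in the log, so the total count is last_iter + 1."""
    total = last_iter + 1
    if total % epochs != 0:
        raise ValueError("logged iteration count does not divide evenly by epochs")
    return total // epochs


def eval_points(total_iters, every):
    """Positive multiples of `every` reached within the run (assumed trigger rule)."""
    return list(range(every, total_iters + 1, every))


per_epoch = iters_per_epoch(last_iter_600, 600)
print("iterations per epoch:", per_epoch)
print("72-epoch run eval points:", eval_points(last_iter_72 + 1, eval_every))
print("600-epoch run eval points:", eval_points(last_iter_600 + 1, eval_every))
```

Under that assumption, a 72-epoch run ends before the first 2000-iteration mark, so the acc of 0.000000 in the log above is a training metric rather than a val result. The best_epoch of 589 reported for the 600-epoch run is consistent with an evaluation at iteration 10000 (10000 / 17 is about 588.2).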
andyjiang1116 commented 2 years ago

Have you tried training without making any changes at all?

morestart commented 2 years ago

No, I'll try it. One moment.

morestart commented 2 years ago

With only the dataset paths changed:

[2022/02/11 11:10:41] root INFO: epoch: [72/72], iter: 1210, lr: 0.000500, loss: 15.314871, acc: 0.000000, norm_edit_dis: 0.204160, reader_cost: 0.04008 s, batch_cost: 0.07538 s, samples: 1024, ips: 1358.38837
[2022/02/11 11:10:42] root INFO: epoch: [72/72], iter: 1220, lr: 0.000500, loss: 15.505561, acc: 0.000000, norm_edit_dis: 0.200902, reader_cost: 0.00006 s, batch_cost: 0.06478 s, samples: 2560, ips: 3951.59575
[2022/02/11 11:10:43] root INFO: epoch: [72/72], iter: 1223, lr: 0.000500, loss: 15.115753, acc: 0.000000, norm_edit_dis: 0.204598, reader_cost: 0.00002 s, batch_cost: 0.01803 s, samples: 768, ips: 4260.71118
andyjiang1116 commented 2 years ago

What I mean is: don't change the dataset or the dictionary either, and see what results you get.

morestart commented 2 years ago

The data referenced in this config isn't downloaded automatically, is it? Running it directly fails with an error:

FileNotFoundError: [Errno 2] No such file or directory: './train_data/ic15_data/rec_gt_train.txt'
INFO 2022-02-11 11:17:49,810 launch_utils.py:341] terminate all the procs
ERROR 2022-02-11 11:17:49,810 launch_utils.py:602] ABORT!!! Out of all 1 trainers, the trainer process with rank=[0] was aborted. Please check its log.
andyjiang1116 commented 2 years ago

See this tutorial: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.4/doc/doc_ch/recognition.md It explains how to download the dataset.

andyjiang1116 commented 2 years ago

No, the data is not downloaded automatically. You need to download it manually; see the tutorial above.
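
Since the data has to be fetched by hand, a quick pre-flight check can catch a missing label file before the launcher aborts. This is a minimal sketch using only the standard library; the path is the one from the FileNotFoundError in this thread, and the helper name missing_files is hypothetical.

```python
from pathlib import Path


def missing_files(paths):
    """Return the subset of `paths` that do not exist as regular files."""
    return [str(p) for p in paths if not Path(p).is_file()]


# Path taken from the FileNotFoundError above; add the Eval label file as needed.
label_files = ["./train_data/ic15_data/rec_gt_train.txt"]

missing = missing_files(label_files)
if missing:
    print("Download or create these label files before training:")
    for path in missing:
        print("  " + path)
else:
    print("All label files found.")
```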

morestart commented 2 years ago

ok

guoge6 commented 2 years ago

Hey, did you find the cause? I ran into this before as well.

morestart commented 2 years ago

No. I switched to the DTRB LMDB dataset. acc went above 80% before the first epoch had even finished, but generalization is poor... I don't know why.

guoge6 commented 2 years ago

I don't really understand it either. Could the dataset be too small?

morestart commented 2 years ago

I don't think the DTRB dataset is small.

guoge6 commented 2 years ago

Some of the evaluation parameters need to match the ones used during training. Could that be where the problem is?
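
One way to check this suggestion is to diff the Train and Eval transform settings. The sketch below works on plain dictionaries shaped like the config posted earlier in this thread (as they would look after loading the YAML); the helper names are hypothetical, and the modified Eval shape at the end is an invented example used only to show what a mismatch looks like.

```python
import copy


def transform_params(section):
    """Flatten a section's `transforms` list into {op_name: params}."""
    flat = {}
    for op in section["dataset"]["transforms"]:
        for name, params in op.items():
            flat[name] = params or {}
    return flat


def diff_transforms(train, eval_):
    """Return {op_name: (train_params, eval_params)} for ops that differ."""
    t, e = transform_params(train), transform_params(eval_)
    return {
        name: (t.get(name), e.get(name))
        for name in sorted(set(t) | set(e))
        if t.get(name) != e.get(name)
    }


# Transform settings as posted in the config above.
train_cfg = {
    "dataset": {
        "transforms": [
            {"DecodeImage": {"img_mode": "BGR", "channel_first": False}},
            {"CTCLabelEncode": None},
            {"RecResizeImg": {"image_shape": [3, 32, 100]}},
            {"KeepKeys": {"keep_keys": ["image", "label", "length"]}},
        ]
    }
}
eval_cfg = copy.deepcopy(train_cfg)

print(diff_transforms(train_cfg, eval_cfg))  # empty dict: the posted settings match

# Hypothetical mismatch, for illustration only.
bad_eval_cfg = copy.deepcopy(train_cfg)
bad_eval_cfg["dataset"]["transforms"][2]["RecResizeImg"]["image_shape"] = [3, 32, 320]
print(diff_transforms(train_cfg, bad_eval_cfg))
```

For the config posted in this thread the Train and Eval transforms are identical, so this particular check does not explain the gap.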

morestart commented 2 years ago

The metric I'm reporting here is on val, not test. My email is morestarforyou@gmail.com; we can discuss this further...

sunshineInmoon commented 2 years ago

@morestart @guoge6 Have you two solved this? I'm now running into the same problem.

morestart commented 2 years ago

No. I ended up writing my own implementation and didn't use Paddle.

mikiihuang commented 1 year ago

I've run into this problem too.

gsx1378 commented 1 year ago

For the ICDAR2015 dataset, the labels contain uppercase letters, but ic15_dict.txt contains only digits and lowercase letters, so the labels are loaded incorrectly. You can try adding the following to ./ppocr/data/imaug/label_ops.py:

class CTCLabelEncode(BaseRecLabelEncode):
    """ Convert between text-label and text-index """

    def __init__(self,
                 max_text_length,
                 character_dict_path=None,
                 use_space_char=False,
                 **kwargs):
        super(CTCLabelEncode, self).__init__(
            max_text_length, character_dict_path, use_space_char)
        # print(kwargs)  # {'lower': True, 'use_gpu': True}
        # self.lower = kwargs['lower']
        # Force lowercasing so uppercase labels match the lowercase-only dictionary.
        self.lower = True
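
The mismatch described in this comment can be checked before training. The following is a minimal sketch, assuming the dictionary holds only digits and lowercase letters as stated above; the sample labels and the helper name uncovered_chars are hypothetical and not part of PaddleOCR.

```python
import string
from collections import Counter


def uncovered_chars(labels, charset, lower=False):
    """Count label characters that the dictionary cannot represent."""
    missing = Counter()
    for text in labels:
        if lower:
            text = text.lower()
        missing.update(ch for ch in text if ch not in charset)
    return missing


# Assumption: digits plus lowercase letters, per the description of ic15_dict.txt.
charset = set(string.digits + string.ascii_lowercase)

# Hypothetical labels; real ones would be read from the label file.
labels = ["JOINT", "Exit", "sale50"]

print("without lowercasing:", dict(uncovered_chars(labels, charset)))
print("with lowercasing:", dict(uncovered_chars(labels, charset, lower=True)))
```

If the first count is non-empty on the training or val labels, those characters have no dictionary entry, which matches the explanation given above. Lowercasing the labels removes the mismatch for dictionaries that are lowercase-only.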

guoge6 commented 1 year ago

Nice!

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.