PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0

low accuracy of recognition model after 2000 epoch training #3773

Closed matthew77777 closed 1 year ago

matthew77777 commented 3 years ago

Hello

I'm trying to train a custom text recognition model with a custom dictionary on over 4000 images (icdar2015), following this.

During training the acc increases and the loss decreases. However, at the end of training the acc drops very low, as shown below.

[image: best_accu]

Here is the command I ran: [image: paddle_rec_command]

And here is the YAML file.

Global:
  use_gpu: true
  epoch_num: 300
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: /content/drive/MyDrive/project/ocr/models/new_model/rec_icdar15_train/03
  save_epoch_step: 3
  # evaluation is run every 1000 iterations
  eval_batch_step: [0, 1000]
  cal_metric_during_train: True
  pretrained_model: /content/drive/MyDrive/project/ocr/models/created_model/en_number_mobile_v2.0_rec/best_accuracy
  #checkpoints: /content/drive/MyDrive/project/ocr/models/created_model/en_number_mobile_v2.0_rec/best_accuracy
  #checkpoints: /content/drive/MyDrive/project/ocr/models/original_model/ch_ppocr_server_v2.0_rec_pre/best_accuracy
  save_inference_dir: /content/drive/MyDrive/project/ocr/PaddleOCR/output/rec/inference
  use_visualdl: False
  infer_img: /content/drive/MyDrive/project/ocr/PaddleOCR/doc/imgs_words_en/word_10.png
  # for data or label process
  #character_dict_path: /content/drive/MyDrive/project/ocr/PaddleOCR/ppocr/utils/en_dict.txt
  character_dict_path: /content/drive/MyDrive/project/ocr/PaddleOCR/ppocr/utils/custom_dict.txt
  #character_dict_path: /content/drive/MyDrive/project/ocr/PaddleOCR/ppocr/utils/ic15_dict.txt
  character_type: EN
  max_text_length: 25
  infer_mode: False
  use_space_char: True
  save_res_path: /content/drive/MyDrive/project/ocr/models/new_model/rec_icdar15_train/rec/predicts_ic15.txt
  #distort: true

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    learning_rate: 0.001
  regularizer:
    name: 'L2'
    factor: 0

Architecture:
  model_type: rec
  algorithm: CRNN
  Transform:
  Backbone:
    name: MobileNetV3
    scale: 0.5
    model_name: large
  Neck:
    name: SequenceEncoder
    encoder_type: rnn
    hidden_size: 96
  Head:
    name: CTCHead
    fc_decay: 0

Loss:
  name: CTCLoss

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

Train:
  dataset:
    name: SimpleDataSet
    data_dir: /content/drive/MyDrive/project/ocr/datasets/train_data/ic15_data/train
    label_file_list: ["/content/drive/MyDrive/project/ocr/datasets/train_data/ic15_data/gt_train.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 100]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: True
    batch_size_per_card: 16
    drop_last: True
    num_workers: 4
    use_shared_memory: False

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /content/drive/MyDrive/project/ocr/datasets/train_data/ic15_data/
    label_file_list: ["/content/drive/MyDrive/project/ocr/datasets/train_data/ic15_data/rec_gt_test.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 100]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 32
    num_workers: 4
    use_shared_memory: False
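
To relate `eval_batch_step` and `epoch_num` above to the dataset size, here is a minimal sketch of the arithmetic. It assumes a single GPU, so the effective batch size equals `batch_size_per_card`, and it uses 4000 as a round number for "over 4000 images". The function names are hypothetical helpers, not part of PaddleOCR.

```python
def iters_per_epoch(num_samples: int, batch_size: int, drop_last: bool) -> int:
    """Number of training iterations in one pass over the data."""
    full, remainder = divmod(num_samples, batch_size)
    # With drop_last=True the final incomplete batch is discarded.
    return full if drop_last or remainder == 0 else full + 1


def epochs_between_evals(eval_interval: int, num_samples: int,
                         batch_size: int, drop_last: bool) -> float:
    """How many epochs pass between two evaluation runs."""
    return eval_interval / iters_per_epoch(num_samples, batch_size, drop_last)


# Assumed values: 4000 training images, batch size 16, drop_last True,
# evaluation every 1000 iterations (second entry of eval_batch_step).
print(iters_per_epoch(4000, 16, True))             # 250
print(epochs_between_evals(1000, 4000, 16, True))  # 4.0
```

Under these assumptions the model is evaluated roughly once every four epochs, so the evaluation accuracy is only sampled occasionally compared with the training accuracy printed every `print_batch_step` iterations.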

Please someone give me advice to create a good recognition model.

Thank you so much,

br,

fadamsyah commented 3 years ago

Hi @matthew77777. Have you solved the issue?

Your model works well on the train data (the accuracy is 1.0), but works extremely badly on the validation data. It seems that your model overfits the train set, cmiiw. I'm wondering how much data we need to get a good model :)
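
As a rough illustration of the overfitting check described here, the sketch below compares training accuracy with evaluation accuracy and flags a large gap. The threshold `max_gap` and the example evaluation accuracy of 0.05 are arbitrary illustrative values, not numbers from this issue or settings from PaddleOCR.

```python
def overfit_gap(train_acc: float, eval_acc: float) -> float:
    """Return how far training accuracy exceeds evaluation accuracy."""
    return train_acc - eval_acc


def looks_overfit(train_acc: float, eval_acc: float, max_gap: float = 0.2) -> bool:
    """Flag a run whose train/eval gap exceeds max_gap.

    max_gap is a hypothetical threshold chosen for illustration only.
    """
    return overfit_gap(train_acc, eval_acc) > max_gap


# Train accuracy of 1.0 as reported above; the eval accuracy is made up.
print(looks_overfit(train_acc=1.0, eval_acc=0.05))  # True
```

A gap this large usually points to too little or too uniform training data, which matches the suggestion to train on more images.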

fadamsyah commented 3 years ago

Anw @matthew77777. Because you use a custom dictionary, I think you need to set character_type: ch instead of EN, as explained here.
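
Separately from the `character_type` setting, one thing that may be worth checking with a custom dictionary is whether every character in the labels is actually covered by it. The sketch below assumes the dictionary file has one character per line and the label file has one `image_path<TAB>label` entry per line. The function names and the sample data are hypothetical.

```python
def load_charset(dict_lines, use_space_char=True):
    """Build the set of known characters from dictionary lines."""
    chars = {line.rstrip("\r\n") for line in dict_lines}
    chars.discard("")  # ignore empty lines
    if use_space_char:
        chars.add(" ")  # mirrors use_space_char: True in the config
    return chars


def uncovered_chars(label_lines, charset):
    """Return label characters that are missing from the charset."""
    missing = set()
    for line in label_lines:
        parts = line.rstrip("\r\n").split("\t", 1)
        if len(parts) != 2:
            continue  # skip malformed lines without a tab separator
        missing.update(ch for ch in parts[1] if ch not in charset)
    return missing


# Made-up sample data standing in for custom_dict.txt and gt_train.txt.
charset = load_charset(["a\n", "b\n", "1\n"])
labels = ["img/1.jpg\tab 1\n", "img/2.jpg\tabc\n"]
print(uncovered_chars(labels, charset))  # {'c'}
```

To run it on real files, pass `open(path, encoding="utf-8")` for each argument instead of the in-memory lists.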

matthew77777 commented 3 years ago

Hi @fadamsyah

Thank you for the comments. I see, so it was overfitting after all.

Actually, I was using 5000 images for training and I increased the number to 80000. Now I get reasonable accuracy, though it is still not good enough.

I recommend trying with more than 100000 images for training.

Thank you so much !

br,
