best metric, acc: 0.0 on recognition

Yosiiiiiiiiiiiiiiii commented 2 years ago

The training seems to be OK. The final epoch shows acc: 0.988281, but the best metric is acc: 0.0. What's wrong with that? I used my custom dataset and adjusted dict.txt.

System environment: Google Colab. This is my yml file:


Global:
  use_gpu: True
  epoch_num: 100
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec/ic15/
  save_epoch_step: 3
  # evaluation is run every 2000 iterations
  eval_batch_step: [0, 2000]
  cal_metric_during_train: True
  pretrained_model:
  checkpoints:
  save_inference_dir: ./
  use_visualdl: False
  infer_img: content/drive/MyDrive/PaddleOCr/gen2011/test/sample_301.png
  # for data or label process
  character_dict_path: ppocr/utils/ic15_dict.txt
  max_text_length: 100
  infer_mode: False
  use_space_char: False
  save_res_path: ./output/rec/predicts_ic15.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    learning_rate: 0.001
  regularizer:
    name: 'L2'
    factor: 0.00001

Architecture:
  model_type: rec
  algorithm: CRNN
  Transform:
  Backbone:
    name: MobileNetV3
    scale: 0.5
    model_name: large
  Neck:
    name: SequenceEncoder
    encoder_type: rnn
    hidden_size: 96
  Head:
    name: CTCHead
    fc_decay: 0.00001

Loss:
  name: CTCLoss

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/custom_dataset/train/
    label_file_list: ["./train_data/custom_dataset/rec_gt_train.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 320]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: True
    batch_size_per_card: 256
    drop_last: True
    num_workers: 8
    use_shared_memory: False

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/custom_dataset/test
    label_file_list: ["./train_data/custom_dataset/rec_gt_test.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 320]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 256
    num_workers: 4
    use_shared_memory: False

and I ran this: [screenshot: Screen Shot 2565-10-24 at 16 57 45]

and I got this result: [screenshot: Screen Shot 2565-10-24 at 16 58 45]
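One thing that may be worth ruling out, given `eval_batch_step: [0, 2000]` in the config above, is whether training ever reaches an evaluation step. The sketch below is not a confirmed diagnosis of this issue. It assumes a single GPU and assumes the trainer evaluates only when the iteration count passes the start offset by a positive multiple of the interval; the function name and the sample counts in the usage lines are hypothetical, since the post does not state the dataset size.

```python
def count_evaluations(num_train_samples, batch_size, epoch_num,
                      eval_start, eval_interval, drop_last=True):
    """Estimate how many evaluation passes a training run would trigger.

    Assumption: evaluation fires at iterations eval_start + k * eval_interval
    for k = 1, 2, ... and never at the start offset itself.
    """
    if drop_last:
        # Incomplete final batch is discarded.
        iters_per_epoch = num_train_samples // batch_size
    else:
        # Ceiling division: the incomplete final batch still counts.
        iters_per_epoch = -(-num_train_samples // batch_size)
    total_iters = iters_per_epoch * epoch_num
    if total_iters <= eval_start:
        return 0
    return (total_iters - eval_start) // eval_interval


# Hypothetical dataset sizes, with batch size, epochs, and eval step from the config.
small_run = count_evaluations(2560, 256, 100, eval_start=0, eval_interval=2000)
large_run = count_evaluations(64000, 256, 100, eval_start=0, eval_interval=2000)
print("evaluations with 2560 samples:", small_run)
print("evaluations with 64000 samples:", large_run)
```

If the estimate is zero, no evaluation would run under these assumptions, and the reported best metric could stay at its initial value regardless of training accuracy. Lowering the second value of `eval_batch_step` would be one way to test that.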

drenched9 commented 2 years ago

Is the dict.txt you adjusted the same file as the one in your command, "Global.character_dict_path=ppocr/utils/ic15_dict.txt"?

Yosiiiiiiiiiiiiiiii commented 2 years ago

@drenched9 I added more English characters to ic15_dict.txt.
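A quick way to check the dictionary question is to scan the label files for characters that do not appear in the dictionary. This is a minimal sketch, assuming the dictionary has one character per line and each label line has the form `image_path<TAB>text`; the helper names are hypothetical and not part of PaddleOCR.

```python
from collections import Counter


def load_charset(dict_lines):
    """Build the character set from dictionary lines (one character per line)."""
    charset = set()
    for line in dict_lines:
        entry = line.strip("\r\n")
        if entry:
            charset.add(entry)
    return charset


def find_unknown_chars(label_lines, charset, delimiter="\t"):
    """Count label characters that are missing from the dictionary."""
    unknown = Counter()
    for line in label_lines:
        parts = line.rstrip("\r\n").split(delimiter, 1)
        if len(parts) != 2:
            continue  # skip malformed lines with no delimiter
        for ch in parts[1]:
            if ch not in charset:
                unknown[ch] += 1
    return unknown


# In-memory example; with real files, pass open(path, encoding="utf-8") instead.
charset = load_charset(["a\n", "b\n", "c\n"])
missing = find_unknown_chars(["img_1.png\tabc\n", "img_2.png\tabD\n"], charset)
print(dict(missing))
```

Running it on both the training and the test label files with the same dictionary path that is passed to training would show whether the two sets are covered equally.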

bely66 commented 2 years ago

@Yosiiiiiiiiiiiiiiii Hi man, I have a question: did you disable RecConAug on purpose? And what training data size did you use? I can see your model's training accuracy is doing pretty well. I'm training my model on a large 9M dataset, which usually reaches high training accuracy (above 95%) with models like vanilla CRNN or SAR, but with PPOCRv3 the accuracy drops to 83%.

Yosiiiiiiiiiiiiiiii commented 2 years ago

hi @bely66

"RecConAug on purpose?" >> I didn't do anything. I was training CRNN, following rec_icdar15_train.yml.
"I can see the training accuracy of your model doing pretty well." >> No, it was not. It was overfitting, so I added 64k samples to the training set and the accuracy went above 98%.
I can't get it to train on PPOCRv3. How did you do that? I posted my issue here: https://github.com/PaddlePaddle/PaddleOCR/issues/8178 Please help if you can ^^

tink2123 commented 1 year ago

@Yosiiiiiiiiiiiiiiii The model is seriously overfitting. I recommend loading the pre-trained model and reducing the learning rate (try reducing it to 0.0001).
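The suggestion above could be applied without editing the yml file by passing overrides with `-o`, in the same `Section.key=value` form as the `Global.character_dict_path=...` override mentioned earlier in the thread. The sketch below only assembles the command. The config path and the pre-trained model path are placeholders (assumptions, not values from this thread), and `build_train_command` is a hypothetical helper.

```python
def build_train_command(config_path, overrides):
    """Assemble a tools/train.py invocation with -o key=value overrides."""
    cmd = ["python3", "tools/train.py", "-c", config_path, "-o"]
    cmd += [f"{key}={value}" for key, value in overrides.items()]
    return cmd


overrides = {
    # Learning rate suggested in the reply above.
    "Optimizer.lr.learning_rate": 0.0001,
    # Placeholder path: point this at the pre-trained weights you downloaded.
    "Global.pretrained_model": "path/to/pretrained/best_accuracy",
}

# Placeholder config path: use the yml file you actually train with.
cmd = build_train_command("path/to/rec_icdar15_train.yml", overrides)
print(" ".join(cmd))
```

The printed command can be pasted into a Colab cell (prefixed with `!`) or run with `subprocess.run(cmd, check=True)` from the repository root.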

PaddlePaddle / PaddleOCR

best metric, acc: 0.0 on recognition #8075