PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
Apache License 2.0
39.34k stars 7.35k forks

en_PP-OCRv3_rec_train fine-tuning not working. Accuracy always starts from 0. Even after 500 epochs on a 1500-sample dataset, the fine-tuned model produces worse results than the original trained model. #12059

Closed saurabhmali1 closed 1 week ago

saurabhmali1 commented 3 weeks ago

Fine-tuning en_PP-OCRv3_rec_train is not working. Accuracy always starts from 0. Even after 500 epochs on a 1500-sample dataset, the fine-tuned model produces worse results than the original trained model.

Config:

Global:
  debug: false
  use_gpu: true
  epoch_num: 300
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/v3_en_mobile
  save_epoch_step: 5
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model: \rec\en\en_PP-OCRv3_rec_train\best_accuracy
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: PaddleOCR\ppocr\utils\en_dict.txt
  max_text_length: &max_text_length 100
  infer_mode: true
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_en.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
    last_pool_kernel_size: [2, 2]
  Head:
    name: MultiHead
    head_list:

Loss:
  name: MultiLoss
  loss_config_list:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: dataset\rec\img
    ext_op_transform_idx: 1
    label_file_list:

UserWangZz commented 3 weeks ago

Has the pretrained model been loaded correctly? If so, you can run simple inference on an image with the model and check whether it converges during normal training. If not, check that the path to the pretrained model is correct.
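A quick way to rule out a wrong path is to check that the weights file actually exists before training starts. The sketch below assumes that `Global.pretrained_model` is a path prefix (for example `.../best_accuracy`) and that the weights sit next to it with a `.pdparams` suffix; the helper name `find_pretrained_weights` is made up for illustration and is not part of PaddleOCR.

```python
from pathlib import Path


def find_pretrained_weights(prefix):
    """Return the weights file for a pretrained_model prefix, or raise."""
    # Assumption: the config value is a prefix such as ".../best_accuracy"
    # and the weights are stored as "best_accuracy.pdparams".
    raw = str(prefix)
    if raw.endswith(".pdparams"):
        raw = raw[: -len(".pdparams")]
    weights = Path(raw + ".pdparams")
    if not weights.is_file():
        # Show the resolved absolute path so a wrong root is easy to spot.
        raise FileNotFoundError(f"no weights file at {weights.resolve()}")
    return weights
```

Printing the resolved path is the useful part here: a value that starts with a backslash, like the one in the config above, is resolved on Windows against the root of the current drive rather than the working directory, so it may point somewhere other than intended.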

saurabhmali1 commented 3 weeks ago

@UserWangZz Yes, it is imported correctly. When I train starting from my own pretrained model, it starts from good accuracy, but when I train en_PP-OCRv3_rec with the given config and en_dict, it starts from 0 accuracy.
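One thing that may be worth ruling out in this situation is a mismatch between the training labels and the dictionary. The sketch below is not PaddleOCR code; it assumes the label file uses one `<image path><TAB><text>` entry per line and that the dictionary file holds one character per line, and the function name `check_labels` is hypothetical.

```python
def check_labels(label_lines, dict_chars, max_text_length, use_space_char=True):
    """Return (line_number, problem) pairs for labels that look unusable."""
    allowed = set(dict_chars)
    if use_space_char:
        # Mirrors use_space_char: true, where a space counts as a valid character.
        allowed.add(" ")
    problems = []
    for number, line in enumerate(label_lines, start=1):
        parts = line.rstrip("\r\n").split("\t", 1)
        if len(parts) != 2:
            problems.append((number, "expected '<image path><TAB><text>'"))
            continue
        text = parts[1]
        unknown = sorted(set(text) - allowed)
        if unknown:
            problems.append((number, f"characters not in dictionary: {unknown}"))
        if len(text) > max_text_length:
            problems.append((number, f"label has {len(text)} characters"))
    return problems
```

A possible use is to read the dictionary with `[line.rstrip("\r\n") for line in open(dict_path, encoding="utf-8")]`, pass the lines of the label file as `label_lines`, and inspect whatever comes back before starting a long training run.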

UserWangZz commented 2 weeks ago

You can first use the eval script to evaluate the performance of the en_PP-OCRv3_rec model on your dataset.
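For reference, the evaluation run can be scripted. The sketch below only builds the argument list; it assumes the command is executed from the PaddleOCR repository root, that `tools/eval.py` accepts `-c` for the config and `-o` for overrides as in the project's documentation, and that the config's `Eval` section already points at your data. The function name and the example paths are placeholders.

```python
import sys


def build_eval_command(config_path, weights_prefix):
    """Build the argument list for an evaluation run with tools/eval.py."""
    return [
        sys.executable,
        "tools/eval.py",
        "-c", config_path,
        # Override the weights on the command line instead of editing the YAML.
        "-o", f"Global.pretrained_model={weights_prefix}",
    ]


# Hypothetical paths, shown only to illustrate the call.
command = build_eval_command("my_en_rec.yml", "en_PP-OCRv3_rec_train/best_accuracy")
```

Passing `command` to `subprocess.run(command, check=True)` would start the evaluation. If the reported accuracy is already near zero for the untouched pretrained weights, the problem is more likely in the data or dictionary than in the training settings.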

UserWangZz commented 2 weeks ago

I did not find any errors in your config file.

saurabhmali1 commented 2 weeks ago

Is there a recommended image size or quality when generating a custom rec dataset? I am working with old, blurry documents.

UserWangZz commented 2 weeks ago

This is an empirical question. We recommend that the input images not be too wide, and if the images in the dataset are blurry, that may affect the recognition results.
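To make "not too wide" concrete, it can help to look at how wide each crop becomes once it is scaled to the fixed input height. The sketch below assumes a target of 48 x 320 (height x width), as used in the published PP-OCRv3 recognition configs, and an aspect-ratio-preserving resize whose width is capped at the target width; both function names are made up for illustration.

```python
import math


def resized_width(height, width, target_h=48, target_w=320):
    """Width of a crop after scaling to target_h, capped at target_w."""
    scaled = math.ceil(target_h * width / height)
    return min(scaled, target_w)


def is_squeezed(height, width, target_h=48, target_w=320):
    """True when the crop is too wide to keep its aspect ratio at the target size."""
    return math.ceil(target_h * width / height) > target_w
```

Under these assumptions, a 24 x 100 crop scales to a width of 200 and keeps its proportions, while a 20 x 400 crop would need a width of 960 and gets compressed horizontally to 320, which makes characters narrower and harder to read, especially when the source is already blurry.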

saurabhmali1 commented 1 week ago

@UserWangZz What should the values of image_shape be? Is it the maximum height and width of the dataset images? Whenever I try to change it to [3,50,320], it throws an assertion error. These are some examples of my dataset images: image_19 image_33 image_66 img_18 img_20 img_30 img_31 img_46