saurabhmali1 closed this issue 1 week ago
Has the pretrained model been imported correctly? If so, try running inference on an image with the model, and check whether it converges during normal training. If not, check whether the path to the pretrained model is correct.
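A quick existence check can rule out a wrong path before a training run. This is only a sketch: it assumes the `pretrained_model` value is a path prefix (for example one ending in `best_accuracy`) and that the weights sit next to it in a file with the `.pdparams` suffix. The helper name `pretrained_weights_exist` is made up for illustration.

```python
from pathlib import Path


def pretrained_weights_exist(prefix: str, suffix: str = ".pdparams") -> bool:
    """Return True if the file `<prefix><suffix>` exists.

    Assumption: the config stores a path prefix and the weights file
    is that prefix plus `.pdparams`.
    """
    return Path(prefix + suffix).is_file()
```

Call it with the exact string from the config, from the same working directory used for training. Note that on Windows a value starting with a single backslash is resolved from the root of the current drive, not from the project folder.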
@UserWangZz Yes, it is imported correctly. When I train from my own pretrained model, it starts from good accuracy, but when I train en_PP-OCRv3_rec with the given config and en_dict, it starts from 0 accuracy.
You can first use the eval script to evaluate the performance of the en_PP-OCRv3_rec model on your dataset.
I did not find any errors in your config file.
Is there a recommended size or quality for images when generating a custom rec dataset? I am working with old, blurry documents.
This is an empirical question. We recommend that the input images not be too wide; if the images in the dataset are blurry, recognition results may suffer.
@UserWangZz What should the values of image_shape be? Is it the maximum height and width of the dataset images? Whenever I change it to [3,50,320], it throws an assertion error. These are some examples of my dataset images.
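On the image_shape question: the value is usually read as the network input [channels, height, width], not as the largest image in the dataset. The sketch below shows how a crop would map onto such an input if it is scaled to a fixed height with the aspect ratio kept and the width capped. The defaults of 48 and 320 are assumptions taken from the stock [3, 48, 320] setting, and `resized_width` is a hypothetical helper, not a PaddleOCR function. Which check raises the assertion for a height of 50 is not established in this thread.

```python
import math


def resized_width(img_h: int, img_w: int, target_h: int = 48, max_w: int = 320) -> int:
    """Width after scaling to `target_h` with aspect ratio kept, capped at `max_w`."""
    if img_h <= 0 or img_w <= 0:
        raise ValueError("image dimensions must be positive")
    scaled = math.ceil(target_h * img_w / img_h)
    return min(scaled, max_w)


# Crops given as (height, width) in pixels.
for h, w in [(60, 400), (60, 900), (30, 100)]:
    print((h, w), "->", (48, resized_width(h, w)))
```

Crops whose scaled width exceeds the cap get squeezed horizontally, which is one reason very wide text lines tend to be recognised poorly.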
en_PP-OCRv3_rec_train fine-tuning not working. Accuracy always starts from 0. Even after 500 epochs on a dataset of 1500 samples, the fine-tuned model produces worse results than the original trained model.
Config:
Global:
  debug: false
  use_gpu: true
  epoch_num: 300
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/v3_en_mobile
  save_epoch_step: 5
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model: \rec\en\en_PP-OCRv3_rec_train\best_accuracy
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: PaddleOCR\ppocr\utils\en_dict.txt
  max_text_length: &max_text_length 100
  infer_mode: true
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_en.txt
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05
Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
    last_pool_kernel_size: [2, 2]
  Head:
    name: MultiHead
    head_list:
Loss:
  name: MultiLoss
  loss_config_list:
PostProcess:
  name: CTCLabelDecode
Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False
Train:
  dataset:
    name: SimpleDataSet
    data_dir: dataset\rec\img
    ext_op_transform_idx: 1
    label_file_list:
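An accuracy that stays at 0 can have several causes. One that is cheap to rule out is a mismatch between the label text and the character dictionary. The sketch below counts label characters that the dictionary does not contain. It assumes one dictionary character per line and label lines of the form image path, tab, text; `chars_missing_from_dict` is a hypothetical helper written for this thread, not part of PaddleOCR.

```python
from collections import Counter


def chars_missing_from_dict(label_lines, dict_lines, use_space_char=True):
    """Count label characters that do not appear in the dictionary."""
    # One dictionary entry per line; blank lines are ignored.
    allowed = {line.rstrip("\r\n") for line in dict_lines if line.rstrip("\r\n")}
    if use_space_char:
        allowed.add(" ")
    missing = Counter()
    for line in label_lines:
        # Assumed label format: "<image path>\t<text>"
        parts = line.rstrip("\r\n").split("\t", 1)
        if len(parts) != 2:
            continue  # skip lines without a tab separator
        missing.update(ch for ch in parts[1] if ch not in allowed)
    return missing


# Tiny in-memory demo; real use would read the label file and en_dict.txt.
demo_labels = ["img/1.jpg\tHello, World", "img/2.jpg\tcafé"]
demo_dict = list("HeloWrdcaf")
print(chars_missing_from_dict(demo_labels, demo_dict))
```

For real files, pass `open(path, encoding="utf-8")` for both arguments. A non-empty result means some label characters can never be predicted with that dictionary.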