Yosiiiiiiiiiiiiiiii closed this issue 1 year ago
Is the dict.txt you adjusted the same file you pass in your command with "Global.character_dict_path=ppocr/utils/ic15_dict.txt"?
@drenched9 I extended ic15_dict.txt with more English characters.
@Yosiiiiiiiiiiiiiiii Hi, I have two questions: did you disable RecConAug on purpose, and how large was the training set you used? I can see your model's training accuracy is doing pretty well. I'm training my model on a large 9M-sample dataset, which usually reaches high training accuracy (above 95%) with models like vanilla CRNN or SAR, but with PPOCRv3 the accuracy drops to 83%.
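Since the question above is whether the dictionary used for training matches the labels, one quick sanity check is to scan the label file for characters that the dictionary does not contain. The following is only a rough sketch, not part of PaddleOCR: the helper names `load_dict` and `find_missing_chars` are made up for illustration, and it assumes the label file uses one `image path<TAB>transcription` entry per line and the dictionary has one character per line.

```python
def load_dict(path):
    """Read a character dictionary, assuming one character per line."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f if line.rstrip("\n")]


def find_missing_chars(label_lines, dict_chars, allow_space=True):
    """Return the sorted label characters that are absent from the dictionary.

    label_lines: iterable of "image path<TAB>transcription" strings (assumed format).
    allow_space: skip spaces, on the assumption that they are configured
                 separately from the dictionary file.
    """
    known = set(dict_chars)
    missing = set()
    for line in label_lines:
        line = line.rstrip("\n")
        parts = line.split("\t", 1)
        if len(parts) != 2:
            # Not a "path<TAB>text" entry; ignore it in this sketch.
            continue
        for ch in parts[1]:
            if ch == " " and allow_space:
                continue
            if ch not in known:
                missing.add(ch)
    return sorted(missing)


# Tiny in-memory example; real use would read the label file from disk.
example_labels = ["img_1.jpg\tHello", "img_2.jpg\tA-1"]
example_dict = list("HeloA1")
missing_chars = find_missing_chars(example_labels, example_dict)
```

In this example `missing_chars` contains only `-`, because every other label character appears in the dictionary. Running the same check on both the train and the test label files would show whether either one uses characters outside the adjusted dictionary.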
hi @bely66
"RecConAug on purpose?" >> I didn't change anything. I was training CRNN, following rec_icdar15_train.yml.
"I can see the training accuracy of your model doing pretty well." >> No, it was not; it was overfitting, so I added 64k samples to the training set and the accuracy went above 98%.
I can't get it to train on PPOCRv3. How did you do that? I posted my issue here: https://github.com/PaddlePaddle/PaddleOCR/issues/8178 Please help if you can ^^
@Yosiiiiiiiiiiiiiiii The model is seriously overfitting. It is recommended to load the pre-trained model and reduce the learning rate (try reducing it to 0.0001).
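To make the suggestion above concrete, here is a minimal sketch of applying the two changes (lower learning rate, pre-trained weights) to a config loaded as a nested dict. It is not PaddleOCR code: `apply_overrides` is a hypothetical helper, the pre-trained model path is a placeholder, and `Global.pretrained_model` is assumed to be the key that points at the weights to load.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of a nested config dict with dotted-key overrides applied."""
    updated = copy.deepcopy(config)  # leave the original config untouched
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = updated
        for key in parents:
            # Create intermediate sections if they do not exist yet.
            node = node.setdefault(key, {})
        node[leaf] = value
    return updated


# Fragment of the config from this thread, written as a Python dict.
base_config = {
    "Global": {},
    "Optimizer": {"name": "Adam", "lr": {"learning_rate": 0.001}},
}

finetune_config = apply_overrides(
    base_config,
    {
        "Optimizer.lr.learning_rate": 0.0001,  # value suggested in the reply above
        # Placeholder path; replace with wherever the pre-trained weights live.
        "Global.pretrained_model": "./pretrain_models/your_model/best_accuracy",
    },
)
```

The same dotted keys can presumably be passed on the command line in the `Section.key=value` form already used in this thread for `Global.character_dict_path`, which avoids editing the YAML file at all.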
The training seems to be OK. The final epoch reports acc: 0.988281, but the best metric is acc: 0.0. What's wrong with that? I used my custom dataset and adjusted dict.txt.
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    learning_rate: 0.001
  regularizer:
    name: 'L2'
    factor: 0.00001

Architecture:
  model_type: rec
  algorithm: CRNN
  Transform:
  Backbone:
    name: MobileNetV3
    scale: 0.5
    model_name: large
  Neck:
    name: SequenceEncoder
    encoder_type: rnn
    hidden_size: 96
  Head:
    name: CTCHead
    fc_decay: 0.00001

Loss:
  name: CTCLoss

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/custom_dataset/train/
    label_file_list: ["./train_data/custom_dataset/rec_gt_train.txt"]
    transforms:

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/custom_dataset/test
    label_file_list: ["./train_data/custom_dataset/rec_gt_test.txt"]
    transforms:
and I ran this
and I got this result