Open Pedro69491 opened 3 days ago
The issue you're facing, where the PaddleOCR inference model produces incorrect predictions (e.g., Chinese symbols instead of digits) despite correct training and evaluation results, is a common problem that arises in OCR workflows due to mismatched configurations or issues during model export and inference. Below are the likely causes and steps to resolve the issue:
- Ensure that the character dictionary is the same during training, evaluation, and inference (Global.character_dict_path in the training configuration, rec_char_dict_path in the PaddleOCR class).
- digit_dict.txt should contain only the digits 0-9, one per line. Double-check that this file does not include any non-numeric characters, blank lines, or stray spaces. Example:
0
1
2
3
4
5
6
7
8
9
If there is any discrepancy in the character dictionary, the inference model may produce unexpected characters, such as Chinese symbols, because it interprets the output indices incorrectly.
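As a quick sanity check on the dictionary file described above, something along these lines could be used. This is only a sketch: validate_digit_dict is a hypothetical helper (not part of PaddleOCR), and it assumes the file is meant to hold exactly the ten digits, one per line.

```python
from pathlib import Path

EXPECTED = [str(d) for d in range(10)]


def validate_digit_dict(path):
    """Return a list of human-readable problems found in a digit dictionary file."""
    problems = []
    raw = Path(path).read_bytes()
    if raw.startswith(b"\xef\xbb\xbf"):
        problems.append("file starts with a UTF-8 byte-order mark")
        raw = raw[3:]
    lines = raw.decode("utf-8").split("\n")
    # A single trailing newline is normal; drop the empty string it produces.
    if lines and lines[-1] == "":
        lines.pop()
    entries = []
    for number, line in enumerate(lines, start=1):
        entry = line.rstrip("\r")  # tolerate Windows line endings
        if entry == "":
            problems.append(f"line {number}: blank line")
        elif entry != entry.strip():
            problems.append(f"line {number}: leading or trailing whitespace")
        elif entry not in EXPECTED:
            problems.append(f"line {number}: unexpected entry {entry!r}")
        else:
            entries.append(entry)
    if len(set(entries)) != len(entries):
        problems.append("duplicate entries present")
    missing = [d for d in EXPECTED if d not in entries]
    if missing:
        problems.append("missing digits: " + ", ".join(missing))
    return problems
```

Calling validate_digit_dict("./digit_dict.txt") and getting an empty list back means none of these particular problems were found; it does not prove that the same file was used at every stage.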
When exporting the trained model to an inference model, ensure that the correct checkpoints and configurations are used. Verify the following:
- The export_model.py script should reference the same configuration file (en_PP-OCRv3_rec.yml) and checkpoint file (iter_epoch_24.pdparams) as used during training and evaluation.
- The Global.character_dict_path parameter in the configuration file should point to your digit_dict.txt.
Command you ran:
python3 tools/export_model.py \
-c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml \
-o Global.checkpoints=./output/v3_en_mobile/iter_epoch_24.pdparams \
Global.save_inference_dir=./inference/rec_digits
Verify the above command and ensure the paths are correct. If there was an issue during export, the inference model may not align with the trained weights.
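The path checks can be scripted before and after the export. The sketch below uses hypothetical helper names, and the expected file names are simply the ones mentioned in this thread (inference.pdmodel, inference.pdiparams, inference.pdiparams.info); other PaddleOCR or Paddle versions may write different files.

```python
from pathlib import Path

# File names as mentioned in this thread; an assumption for other versions.
EXPECTED_EXPORT_FILES = (
    "inference.pdmodel",
    "inference.pdiparams",
    "inference.pdiparams.info",
)


def missing_paths(paths):
    """Return the subset of the given paths that do not exist on disk."""
    return [str(p) for p in map(Path, paths) if not p.exists()]


def missing_export_files(export_dir, expected=EXPECTED_EXPORT_FILES):
    """Return the expected inference files that are absent or empty in export_dir."""
    export_dir = Path(export_dir)
    missing = []
    for name in expected:
        candidate = export_dir / name
        if not candidate.is_file() or candidate.stat().st_size == 0:
            missing.append(name)
    return missing
```

For example, missing_paths could be called with the config and checkpoint paths from the command above before exporting, and missing_export_files("./inference/rec_digits") afterwards. Empty lists only show that nothing obvious is missing, not that the weights are the intended ones.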
The issue may lie in the parameters passed to the PaddleOCR class during inference. Based on your code:
ocr = PaddleOCR(
    use_gpu=False,
    rec_char_dict_path='./digit_dict.txt',
    rec_model_dir="./PaddleOCR/inference/rec_digits"
)
Check the following:
- rec_model_dir should point to the folder containing the exported inference model files (inference.pdiparams, inference.pdiparams.info, and inference.pdmodel).
- rec_char_dict_path should point to the same digit_dict.txt that was used for training.
If the configuration is incorrect, the model may fail to interpret its predictions properly.
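To see why a dictionary mismatch can turn digits into unrelated characters, here is a simplified greedy CTC decoding sketch. It is not PaddleOCR's CTCLabelDecode: it assumes index 0 is the CTC blank and that the remaining indices map to dictionary entries in file order. The second character set is made up purely for illustration.

```python
BLANK = 0  # assumption: index 0 is reserved for the CTC blank


def ctc_greedy_decode(indices, charset):
    """Collapse repeated indices, drop blanks, and map the rest to characters."""
    table = ["<blank>"] + list(charset)
    chars = []
    previous = None
    for index in indices:
        if index != previous and index != BLANK:
            chars.append(table[index])
        previous = index
    return "".join(chars)


digit_charset = list("0123456789")
other_charset = list("的一是在不了有和人这")  # hypothetical non-digit dictionary

# Per-timestep argmax indices a digit model might emit for the text "42".
raw_indices = [0, 5, 5, 0, 3, 3, 0]

print(ctc_greedy_decode(raw_indices, digit_charset))  # 42
print(ctc_greedy_decode(raw_indices, other_charset))  # 不是
```

The network output is identical in both calls; only the lookup table differs, which is why a wrong or default dictionary at inference time can produce confident-looking but meaningless text.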
Even though the model achieves 100% accuracy during evaluation, this does not guarantee it will work perfectly during inference. The evaluation step uses the training pipeline, which differs slightly from the inference pipeline in terms of preprocessing and postprocessing. Here are some steps to debug:
- Test with infer_rec.py: Use the infer_rec.py script to run the trained checkpoint directly on your test images. This script builds its preprocessing and postprocessing from the same configuration file used for training, and it takes its settings through -c and -o rather than through flags such as --image_dir or --rec_model_dir:
python3 tools/infer_rec.py \
-c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml \
-o Global.checkpoints=./output/v3_en_mobile/iter_epoch_24.pdparams \
Global.infer_img=path_to_test_image \
Global.use_gpu=False
- Compare the results from infer_rec.py with those from the PaddleOCR API. If the results from infer_rec.py are correct, the issue may lie in the export step or in how the PaddleOCR class is configured.
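Comparing the infer_rec.py results with the PaddleOCR API results can be made systematic with a small helper. This is a sketch with hypothetical names; it assumes each pipeline's predictions have already been collected into a dict mapping image path to recognized text, and it does not call PaddleOCR itself.

```python
def compare_predictions(reference, candidate):
    """Return {image: (reference_text, candidate_text)} for every disagreement.

    Images present on only one side are reported with None for the other side.
    """
    mismatches = {}
    for image in sorted(set(reference) | set(candidate)):
        ref_text = reference.get(image)
        cand_text = candidate.get(image)
        if ref_text != cand_text:
            mismatches[image] = (ref_text, cand_text)
    return mismatches
```

An empty result means the two pipelines agree on every image. A result covering nearly every image points toward a systematic cause such as the dictionary or the loaded weights rather than a few hard samples.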
Differences in preprocessing between training and inference can also cause discrepancies. For example:
- The image_shape under RecResizeImg in your configuration file (e.g., en_PP-OCRv3_rec.yml) should match the input shape expected by the inference model.
- Verify the image_shape in your config file (e.g., [3, 48, 320] for recognition) and ensure it matches the rec_image_shape applied in your inference script.
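To illustrate why the shape setting matters, the sketch below computes the width an image would get under an aspect-ratio-preserving resize to a fixed height, capped at a maximum width. This is the general scheme for recognition inputs; the function is hypothetical and simplified, PaddleOCR's own resize code may differ in details, and the 28x28 crop size is an assumed example.

```python
def resized_width(orig_h, orig_w, target_h, max_w):
    """Width after scaling an image to target_h while keeping its aspect ratio,
    capped at max_w (the remainder would be filled with padding)."""
    if orig_h <= 0 or orig_w <= 0:
        raise ValueError("image dimensions must be positive")
    scaled = -(-(target_h * orig_w) // orig_h)  # integer ceiling division
    return min(max_w, scaled)


# A hypothetical 28x28 digit crop under two different shape settings:
print(resized_width(28, 28, target_h=48, max_w=320))  # 48
print(resized_width(28, 28, target_h=32, max_w=100))  # 32
```

The same crop ends up at a different scale under each setting, so a model trained with one shape and fed images prepared with another sees characters at a size it was not trained on.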
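- Use the same digit_dict.txt file across training, evaluation, export, and inference.
- Test with infer_rec.py: Run the model using the infer_rec.py script and compare results.
Related:
- Issue #13972: After exporting the inference model, results from the paddleocr command line are inconsistent
- Discussion #14088: A recognition model trained on a custom dataset gives wrong results after its train_model is converted to an inference_model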
To resolve your issue:
- Verify the digit_dict.txt file and ensure it is correctly referenced during all steps.
- Use infer_rec.py to isolate whether the issue lies in the export process or the PaddleOCR API configuration.
If the problem persists, share the outputs of infer_rec.py and your configuration file for further debugging.
Response generated by 🤖 feifei-bot | chatgpt-4o-latest
python3 tools/infer_rec.py \
-c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml \
-o Global.infer_rec_model_dir="./inference/rec_digits" \
Global.infer_img="../digits/valid/0/7616.png" \
Global.use_gpu=False
[2024/11/27 15:58:33] ppocr INFO: Architecture :
[2024/11/27 15:58:33] ppocr INFO: Backbone :
[2024/11/27 15:58:33] ppocr INFO: last_conv_stride : [1, 2]
[2024/11/27 15:58:33] ppocr INFO: last_pool_kernel_size : [2, 2]
[2024/11/27 15:58:33] ppocr INFO: last_pool_type : avg
[2024/11/27 15:58:33] ppocr INFO: name : MobileNetV1Enhance
[2024/11/27 15:58:33] ppocr INFO: scale : 0.5
[2024/11/27 15:58:33] ppocr INFO: Head :
[2024/11/27 15:58:33] ppocr INFO: head_list :
[2024/11/27 15:58:33] ppocr INFO: CTCHead :
[2024/11/27 15:58:33] ppocr INFO: Head :
[2024/11/27 15:58:33] ppocr INFO: fc_decay : 1e-05
[2024/11/27 15:58:33] ppocr INFO: Neck :
[2024/11/27 15:58:33] ppocr INFO: depth : 2
[2024/11/27 15:58:33] ppocr INFO: dims : 64
[2024/11/27 15:58:33] ppocr INFO: hidden_dims : 120
[2024/11/27 15:58:33] ppocr INFO: name : svtr
[2024/11/27 15:58:33] ppocr INFO: use_guide : True
[2024/11/27 15:58:33] ppocr INFO: SARHead :
[2024/11/27 15:58:33] ppocr INFO: enc_dim : 512
[2024/11/27 15:58:33] ppocr INFO: max_text_length : 25
[2024/11/27 15:58:33] ppocr INFO: name : MultiHead
[2024/11/27 15:58:33] ppocr INFO: Transform : None
[2024/11/27 15:58:33] ppocr INFO: algorithm : SVTR_LCNet
[2024/11/27 15:58:33] ppocr INFO: model_type : rec
[2024/11/27 15:58:33] ppocr INFO: Eval :
[2024/11/27 15:58:33] ppocr INFO: dataset :
[2024/11/27 15:58:33] ppocr INFO: data_dir : ../digits/valid
[2024/11/27 15:58:33] ppocr INFO: label_file_list : ['../digits/labels/valid.txt']
[2024/11/27 15:58:33] ppocr INFO: name : SimpleDataSet
[2024/11/27 15:58:33] ppocr INFO: transforms :
[2024/11/27 15:58:33] ppocr INFO: DecodeImage :
[2024/11/27 15:58:33] ppocr INFO: channel_first : False
[2024/11/27 15:58:33] ppocr INFO: img_mode : BGR
[2024/11/27 15:58:33] ppocr INFO: MultiLabelEncode : None
[2024/11/27 15:58:33] ppocr INFO: RecResizeImg :
[2024/11/27 15:58:33] ppocr INFO: image_shape : [3, 48, 320]
[2024/11/27 15:58:33] ppocr INFO: KeepKeys :
[2024/11/27 15:58:33] ppocr INFO: keep_keys : ['image', 'label_ctc', 'label_sar', 'length', 'valid_ratio']
[2024/11/27 15:58:33] ppocr INFO: loader :
[2024/11/27 15:58:33] ppocr INFO: batch_size_per_card : 128
[2024/11/27 15:58:33] ppocr INFO: drop_last : False
[2024/11/27 15:58:33] ppocr INFO: num_workers : 4
[2024/11/27 15:58:33] ppocr INFO: shuffle : False
[2024/11/27 15:58:33] ppocr INFO: Global :
[2024/11/27 15:58:33] ppocr INFO: cal_metric_during_train : True
[2024/11/27 15:58:33] ppocr INFO: character_dict_path : ../digit_dict.txt
[2024/11/27 15:58:33] ppocr INFO: checkpoints : None
[2024/11/27 15:58:33] ppocr INFO: debug : False
[2024/11/27 15:58:33] ppocr INFO: distributed : False
[2024/11/27 15:58:33] ppocr INFO: epoch_num : 25
[2024/11/27 15:58:33] ppocr INFO: eval_batch_step : [0, 2000]
[2024/11/27 15:58:33] ppocr INFO: infer_img : ../digits/valid/0/7616.png
[2024/11/27 15:58:33] ppocr INFO: infer_mode : False
[2024/11/27 15:58:33] ppocr INFO: infer_rec_model_dir : ./inference/rec_digits
[2024/11/27 15:58:33] ppocr INFO: log_smooth_window : 20
[2024/11/27 15:58:33] ppocr INFO: max_text_length : 25
[2024/11/27 15:58:33] ppocr INFO: pretrained_model : None
[2024/11/27 15:58:33] ppocr INFO: print_batch_step : 10
[2024/11/27 15:58:33] ppocr INFO: save_epoch_step : 3
[2024/11/27 15:58:33] ppocr INFO: save_inference_dir : None
[2024/11/27 15:58:33] ppocr INFO: save_model_dir : ./output/v3_en_mobile
[2024/11/27 15:58:33] ppocr INFO: save_res_path : ./output/rec/predicts_ppocrv3_en.txt
[2024/11/27 15:58:33] ppocr INFO: use_gpu : False
[2024/11/27 15:58:33] ppocr INFO: use_space_char : True
[2024/11/27 15:58:33] ppocr INFO: use_visualdl : False
[2024/11/27 15:58:33] ppocr INFO: Loss :
[2024/11/27 15:58:33] ppocr INFO: loss_config_list :
[2024/11/27 15:58:33] ppocr INFO: CTCLoss : None
[2024/11/27 15:58:33] ppocr INFO: SARLoss : None
[2024/11/27 15:58:33] ppocr INFO: name : MultiLoss
[2024/11/27 15:58:33] ppocr INFO: Metric :
[2024/11/27 15:58:33] ppocr INFO: ignore_space : False
[2024/11/27 15:58:33] ppocr INFO: main_indicator : acc
[2024/11/27 15:58:33] ppocr INFO: name : RecMetric
[2024/11/27 15:58:33] ppocr INFO: Optimizer :
[2024/11/27 15:58:33] ppocr INFO: beta1 : 0.9
[2024/11/27 15:58:33] ppocr INFO: beta2 : 0.999
[2024/11/27 15:58:33] ppocr INFO: lr :
[2024/11/27 15:58:33] ppocr INFO: learning_rate : 0.001
[2024/11/27 15:58:33] ppocr INFO: name : Cosine
[2024/11/27 15:58:33] ppocr INFO: warmup_epoch : 5
[2024/11/27 15:58:33] ppocr INFO: name : Adam
[2024/11/27 15:58:33] ppocr INFO: regularizer :
[2024/11/27 15:58:33] ppocr INFO: factor : 3e-05
[2024/11/27 15:58:33] ppocr INFO: name : L2
[2024/11/27 15:58:33] ppocr INFO: PostProcess :
[2024/11/27 15:58:33] ppocr INFO: name : CTCLabelDecode
[2024/11/27 15:58:33] ppocr INFO: Train :
[2024/11/27 15:58:33] ppocr INFO: dataset :
[2024/11/27 15:58:33] ppocr INFO: data_dir : ../digits/train/
[2024/11/27 15:58:33] ppocr INFO: ext_op_transform_idx : 1
[2024/11/27 15:58:33] ppocr INFO: label_file_list : ['../digits/labels/train.txt']
[2024/11/27 15:58:33] ppocr INFO: name : SimpleDataSet
[2024/11/27 15:58:33] ppocr INFO: transforms :
[2024/11/27 15:58:33] ppocr INFO: DecodeImage :
[2024/11/27 15:58:33] ppocr INFO: channel_first : False
[2024/11/27 15:58:33] ppocr INFO: img_mode : BGR
[2024/11/27 15:58:33] ppocr INFO: RecConAug :
[2024/11/27 15:58:33] ppocr INFO: ext_data_num : 2
[2024/11/27 15:58:33] ppocr INFO: image_shape : [48, 320, 3]
[2024/11/27 15:58:33] ppocr INFO: max_text_length : 25
[2024/11/27 15:58:33] ppocr INFO: prob : 0.5
[2024/11/27 15:58:33] ppocr INFO: RecAug : None
[2024/11/27 15:58:33] ppocr INFO: MultiLabelEncode : None
[2024/11/27 15:58:33] ppocr INFO: RecResizeImg :
[2024/11/27 15:58:33] ppocr INFO: image_shape : [3, 48, 320]
[2024/11/27 15:58:33] ppocr INFO: KeepKeys :
[2024/11/27 15:58:33] ppocr INFO: keep_keys : ['image', 'label_ctc', 'label_sar', 'length', 'valid_ratio']
[2024/11/27 15:58:33] ppocr INFO: loader :
[2024/11/27 15:58:33] ppocr INFO: batch_size_per_card : 8
[2024/11/27 15:58:33] ppocr INFO: drop_last : True
[2024/11/27 15:58:33] ppocr INFO: num_workers : 4
[2024/11/27 15:58:33] ppocr INFO: shuffle : True
[2024/11/27 15:58:33] ppocr INFO: profiler_options : None
[2024/11/27 15:58:33] ppocr INFO: train with paddle 2.6.2 and device Place(cpu)
[2024/11/27 15:58:33] ppocr INFO: train from scratch
[2024/11/27 15:58:33] ppocr INFO: infer_img: ../digits/valid/0/7616.png
[2024/11/27 15:58:34] ppocr INFO: result: 76 0.09323589503765106
[2024/11/27 15:58:34] ppocr INFO: success!
I verified digit_dict and the exports, and everything looks fine.
It should have given me 0.
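The log above reports checkpoints : None and pretrained_model : None, followed by "train from scratch" and a result with a confidence of about 0.09. That combination suggests, though the log alone does not prove it, that no trained weights were loaded for this run. Below is a small sketch for pulling these signals out of a saved log; the function name and the 0.5 threshold are arbitrary assumptions, not PaddleOCR features.

```python
import re

RESULT_RE = re.compile(
    r"ppocr INFO: result: (?P<text>.*?)\s+"
    r"(?P<score>\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)\s*$"
)


def summarize_log(lines, min_score=0.5):
    """Collect recognition results and possible warning signs from ppocr log lines."""
    summary = {"results": [], "warnings": []}
    for line in lines:
        if "train from scratch" in line:
            # Assumption: this message means neither checkpoints nor
            # pretrained_model was set, so the weights are untrained.
            summary["warnings"].append("no checkpoint or pretrained model was loaded")
        match = RESULT_RE.search(line)
        if match:
            text = match.group("text")
            score = float(match.group("score"))
            summary["results"].append((text, score))
            if score < min_score:
                summary["warnings"].append(f"low confidence {score:.3f} for {text!r}")
    return summary
```

If both warnings show up for a run, it may be worth passing the trained weights explicitly (for example through Global.checkpoints, as in the eval command in this thread) and running the same image again.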
🐛 Bug (Problem Description)
I have been using PaddleOCR's training capabilities on a small dataset of digits. After 25 epochs, the model's accuracy reaches 100%. I then evaluate the model and get an accuracy of 100% as well. The problem is that when I test the model on the exact same images I used for evaluation, I get completely different results. Of course, I exported the best weights from the trained recognition model.
from PaddleOCR.paddleocr import PaddleOCR

ocr = PaddleOCR(
    use_gpu=False,
    rec_char_dict_path='./digit_dict.txt',
    rec_model_dir="./PaddleOCR/inference/rec_digits",  # Path to the saved model
)
result = ocr.ocr(image_path)
Output: a Chinese symbol instead of a number.
Note: digit_dict refers to a small text file containing the digits 0 to 9.
I tried to use infer_rec.py, but the results were again poor, even on data that had already been used for validation. I am not sure what I should do next.
🏃♂️ Environment (Runtime Environment)
Linux: Ubuntu 22.04
Python: 3.10.12
🌰 Minimal Reproducible Example (Minimal Demo That Reproduces the Issue)
python3 tools/train.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.use_gpu=False
python3 tools/eval.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.checkpoints=./output/v3_en_mobile/iter_epoch_24.pdparams Global.use_gpu=False
python3 tools/export_model.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.checkpoints=./output/v3_en_mobile/iter_epoch_24.pdparams Global.save_inference_dir=./inference/rec_digits