PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0

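Model training problem: when recognizing bank cards, some fonts are not supported. I want to improve the recognition rate through training, but I keep getting the error: AssertionError: The length of ratio_list should be the same as the file_list. #6107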

Closed webtang closed 2 years ago

webtang commented 2 years ago

Please provide the following information to quickly locate the problem

I generated 10,000 images with this font, using 10,000 for training and 20 for validation. I followed the steps in the documentation: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.4/doc/doc_ch/recognition.md

The detailed process is as follows:

1. Data preparation
Training label file train_data.txt, location: D:\python-project\train_data.txt. Content format:
D:\python-project\535323891528254.jpg 535323891528254
D:\python-project\377387486821203.jpg 377387486821203
D:\python-project\665787415268091.jpg 665787415268091
Evaluation label file test.txt, location: D:\PaddleOCR-release-2.4\train_data\test\test.txt. Content format:
D:\PaddleOCR-release-2.4\train_data\test\5563608471724.jpg 5563608471724
D:\PaddleOCR-release-2.4\train_data\test\55624639895957.jpg 55624639895957
D:\PaddleOCR-release-2.4\train_data\test\55636833766700.jpg 55636833766700
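A later comment in this thread reports that the image path and the label must be separated by a tab, not a space, otherwise training fails with IndexError: list index out of range. The sketch below writes a label file in that form and flags lines that do not split into exactly two tab-separated fields. The function names write_label_file and find_bad_lines are hypothetical helpers for illustration, not part of PaddleOCR.

```python
from pathlib import Path


def write_label_file(label_path, samples):
    """Write one 'image_path<TAB>label' line per (image_path, label) pair."""
    lines = [f"{image}\t{text}" for image, text in samples]
    Path(label_path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def find_bad_lines(label_path):
    """Return 1-based numbers of lines without exactly two tab-separated fields."""
    bad = []
    with open(label_path, encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            if len(line.rstrip("\n").split("\t")) != 2:
                bad.append(number)
    return bad
```

Running find_bad_lines on an existing label file before training is a cheap way to catch space-separated lines like the ones shown above.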

2. Configuration file
Location: D:\PaddleOCR-release-2.4\configs\rec\rec_icdar15_train.yml
File contents:

Global:
  use_gpu: false
  epoch_num: 72
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec/ic15/
  save_epoch_step: 3
  # evaluation is run every 2000 iterations
  eval_batch_step: [0, 10]
  cal_metric_during_train: True
  pretrained_model: D:\PaddleOCR-release-2.4\pretrain_models\rec_r34_vd_none_bilstm_ctc_v2.0_train
  checkpoints:
  save_inference_dir: ./
  use_visualdl: False
  infer_img:
  # for data or label process
  character_dict_path: D:\PaddleOCR-release-2.4\train_data\word_dict.txt
  max_text_length: 25
  infer_mode: False
  use_space_char: False
  save_res_path: ./output/rec/predicts_ic15.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    learning_rate: 0.0005
  regularizer:
    name: 'L2'
    factor: 0

Architecture:
  model_type: rec
  algorithm: CRNN
  Transform:
  Backbone:
    name: MobileNetV3
    scale: 0.5
    model_name: large
  Neck:
    name: SequenceEncoder
    encoder_type: rnn
    hidden_size: 96
  Head:
    name: CTCHead
    fc_decay: 0

Loss:
  name: CTCLoss

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

Train:
  dataset:
    name: SimpleDataSet
    data_dir: D:\python-project\
    label_file_list: D:\python-project\train_data.txt
    transforms:

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: D:\PaddleOCR-release-2.4\train_data\test
    label_file_list: D:\PaddleOCR-release-2.4\train_data\test\test.txt
    transforms:

3. Dictionary file
File path: D:\PaddleOCR-release-2.4\train_data\word_dict.txt
File contents: 1 2 3 4 5 6 7 8 9 0

4. Pretrained model

Model path: D:\PaddleOCR-release-2.4\pretrain_models\rec_r34_vd_none_bilstm_ctc_v2.0_train

Directory contents:

.DS_Store best_accuracy.pdopt best_accuracy.pdparams best_accuracy.states train.log

5. Start training

PS D:\PaddleOCR-release-2.4> python .\tools\train.py -c .\configs\rec\rec_icdar15_train.yml
[2022/04/30 21:32:51] root INFO: Architecture :
[2022/04/30 21:32:51] root INFO:     Backbone :
[2022/04/30 21:32:51] root INFO:         model_name : large
[2022/04/30 21:32:51] root INFO:         name : MobileNetV3
[2022/04/30 21:32:51] root INFO:         scale : 0.5
[2022/04/30 21:32:51] root INFO:     Head :
[2022/04/30 21:32:51] root INFO:         fc_decay : 0
[2022/04/30 21:32:51] root INFO:         name : CTCHead
[2022/04/30 21:32:51] root INFO:     Neck :
[2022/04/30 21:32:51] root INFO:         encoder_type : rnn
[2022/04/30 21:32:51] root INFO:         hidden_size : 96
[2022/04/30 21:32:51] root INFO:         name : SequenceEncoder
[2022/04/30 21:32:51] root INFO:     Transform : None
[2022/04/30 21:32:51] root INFO:     algorithm : CRNN
[2022/04/30 21:32:51] root INFO:     model_type : rec
[2022/04/30 21:32:51] root INFO: Eval :
[2022/04/30 21:32:51] root INFO:     dataset :
[2022/04/30 21:32:51] root INFO:         data_dir : D:\PaddleOCR-release-2.4\train_data\test
[2022/04/30 21:32:51] root INFO:         label_file_list : D:\PaddleOCR-release-2.4\train_data\test\test.txt
[2022/04/30 21:32:51] root INFO:         name : SimpleDataSet
[2022/04/30 21:32:51] root INFO:         transforms :
[2022/04/30 21:32:51] root INFO:             DecodeImage :
[2022/04/30 21:32:51] root INFO:                 channel_first : False
[2022/04/30 21:32:51] root INFO:                 img_mode : BGR
[2022/04/30 21:32:51] root INFO:             CTCLabelEncode : None
[2022/04/30 21:32:51] root INFO:             RecResizeImg :
[2022/04/30 21:32:51] root INFO:                 image_shape : [3, 32, 100]
[2022/04/30 21:32:51] root INFO:             KeepKeys :
[2022/04/30 21:32:51] root INFO:                 keep_keys : ['image', 'label', 'length']
[2022/04/30 21:32:51] root INFO:     loader :
[2022/04/30 21:32:51] root INFO:         batch_size_per_card : 10
[2022/04/30 21:32:51] root INFO:         drop_last : False
[2022/04/30 21:32:51] root INFO:         num_workers : 4
[2022/04/30 21:32:51] root INFO:         shuffle : False
[2022/04/30 21:32:51] root INFO:         use_shared_memory : False
[2022/04/30 21:32:51] root INFO: Global :
[2022/04/30 21:32:51] root INFO:     cal_metric_during_train : True
[2022/04/30 21:32:51] root INFO:     character_dict_path : D:\PaddleOCR-release-2.4\train_data\word_dict.txt
[2022/04/30 21:32:51] root INFO:     checkpoints : None
[2022/04/30 21:32:51] root INFO:     debug : False
[2022/04/30 21:32:51] root INFO:     distributed : False
[2022/04/30 21:32:51] root INFO:     epoch_num : 72
[2022/04/30 21:32:51] root INFO:     eval_batch_step : [0, 10]
[2022/04/30 21:32:51] root INFO:     infer_img : None
[2022/04/30 21:32:51] root INFO:     infer_mode : False
[2022/04/30 21:32:51] root INFO:     log_smooth_window : 20
[2022/04/30 21:32:51] root INFO:     max_text_length : 25
[2022/04/30 21:32:51] root INFO:     pretrained_model : D:\PaddleOCR-release-2.4\pretrain_models\rec_r34_vd_none_bilstm_ctc_v2.0_train
[2022/04/30 21:32:51] root INFO:     print_batch_step : 10
[2022/04/30 21:32:51] root INFO:     save_epoch_step : 3
[2022/04/30 21:32:51] root INFO:     save_inference_dir : ./
[2022/04/30 21:32:51] root INFO:     save_model_dir : ./output/rec/ic15/
[2022/04/30 21:32:51] root INFO:     save_res_path : ./output/rec/predicts_ic15.txt
[2022/04/30 21:32:51] root INFO:     use_gpu : False
[2022/04/30 21:32:51] root INFO:     use_space_char : False
[2022/04/30 21:32:51] root INFO:     use_visualdl : False
[2022/04/30 21:32:51] root INFO: Loss :
[2022/04/30 21:32:51] root INFO:     name : CTCLoss
[2022/04/30 21:32:51] root INFO: Metric :
[2022/04/30 21:32:51] root INFO:     main_indicator : acc
[2022/04/30 21:32:51] root INFO:     name : RecMetric
[2022/04/30 21:32:51] root INFO: Optimizer :
[2022/04/30 21:32:51] root INFO:     beta1 : 0.9
[2022/04/30 21:32:51] root INFO:     beta2 : 0.999
[2022/04/30 21:32:51] root INFO:     lr :
[2022/04/30 21:32:51] root INFO:         learning_rate : 0.0005
[2022/04/30 21:32:51] root INFO:     name : Adam
[2022/04/30 21:32:51] root INFO:     regularizer :
[2022/04/30 21:32:51] root INFO:         factor : 0
[2022/04/30 21:32:51] root INFO:         name : L2
[2022/04/30 21:32:51] root INFO: PostProcess :
[2022/04/30 21:32:51] root INFO:     name : CTCLabelDecode
[2022/04/30 21:32:51] root INFO: Train :
[2022/04/30 21:32:51] root INFO:     dataset :
[2022/04/30 21:32:51] root INFO:         data_dir : D:\python-project\
[2022/04/30 21:32:51] root INFO:         label_file_list : D:\python-project\train_data.txt
[2022/04/30 21:32:51] root INFO:         name : SimpleDataSet
[2022/04/30 21:32:51] root INFO:         transforms :
[2022/04/30 21:32:51] root INFO:             DecodeImage :
[2022/04/30 21:32:51] root INFO:                 channel_first : False
[2022/04/30 21:32:51] root INFO:                 img_mode : BGR
[2022/04/30 21:32:51] root INFO:             CTCLabelEncode : None
[2022/04/30 21:32:51] root INFO:             RecResizeImg :
[2022/04/30 21:32:51] root INFO:                 image_shape : [3, 32, 100]
[2022/04/30 21:32:51] root INFO:             KeepKeys :
[2022/04/30 21:32:51] root INFO:                 keep_keys : ['image', 'label', 'length']
[2022/04/30 21:32:51] root INFO:     loader :
[2022/04/30 21:32:51] root INFO:         batch_size_per_card : 256
[2022/04/30 21:32:51] root INFO:         drop_last : True
[2022/04/30 21:32:51] root INFO:         num_workers : 8
[2022/04/30 21:32:51] root INFO:         shuffle : True
[2022/04/30 21:32:51] root INFO:         use_shared_memory : False
[2022/04/30 21:32:51] root INFO: profiler_options : None
[2022/04/30 21:32:51] root INFO: train with paddle 2.2.2 and device CPUPlace
Traceback (most recent call last):
  File ".\tools\train.py", line 148, in <module>
    main(config, device, logger, vdl_writer)
  File ".\tools\train.py", line 52, in main
    train_dataloader = build_dataloader(config, 'Train', device, logger)
  File "D:\PaddleOCR-release-2.4\ppocr\data\__init__.py", line 64, in build_dataloader
    dataset = eval(module_name)(config, mode, logger, seed)
  File "D:\PaddleOCR-release-2.4\ppocr\data\simple_dataset.py", line 41, in __init__
    ) == data_source_num, "The length of ratio_list should be the same as the file_list."
AssertionError: The length of ratio_list should be the same as the file_list.
PS D:\PaddleOCR-release-2.4>

I cannot figure out where the problem is. This is my first time trying this, so I would appreciate any guidance. Thanks!

webtang commented 2 years ago

The problem is solved. I am recording the details here for anyone who runs into something similar; for reference only.

1. Problem 1: most problems are caused by the configuration file. Here is my configuration file, for reference only:

Global:
  use_gpu: false  # whether to train on GPU
  epoch_num: 72
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec/ic15/  # directory where the model is saved
  save_epoch_step: 3
  # evaluation is run every 2000 iterations
  eval_batch_step: [0, 10]  # do not make this too small; mine was too small
  cal_metric_during_train: True
  # must point to the best_accuracy.pdparams file in the model directory
  # (written without the extension, as shown here)
  pretrained_model: D:\PaddleOCR-release-2.4\pretrain_models\ch_ppocr_server_v2.0_rec_pre\best_accuracy
  checkpoints:
  save_inference_dir: ./
  use_visualdl: False
  infer_img:
  # for data or label process
  # custom dictionary file; see my question above for its contents
  character_dict_path: D:\PaddleOCR-release-2.4\train_data\word_dict.txt
  max_text_length: 25
  infer_mode: False
  use_space_char: False
  save_res_path: ./output/rec/predicts_ic15.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    learning_rate: 0.0005
  regularizer:
    name: 'L2'
    factor: 0

Architecture:
  model_type: rec
  algorithm: CRNN
  Transform:
  Backbone:
    name: MobileNetV3
    scale: 0.5
    model_name: large
  Neck:
    name: SequenceEncoder
    encoder_type: rnn
    hidden_size: 96
  Head:
    name: CTCHead
    fc_decay: 0

Loss:
  name: CTCLoss

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

Train:
  dataset:
    name: SimpleDataSet
    data_dir: D:\python-project\
    # It is best to use [] and double quotes here. In train_data2.txt, the file name
    # and the label must be separated by a tab, not a space; with a space you get
    # IndexError: list index out of range.
    label_file_list: ["D:\python-project\train_data2.txt"]
    transforms:

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: D:\PaddleOCR-release-2.4\train_data\test
    label_file_list: ["D:\PaddleOCR-release-2.4\train_data\test\test.txt"]
    transforms:
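Why the brackets matter: the failing assertion compares the length of ratio_list with the length of label_file_list. If label_file_list is a plain string, its length is the number of characters in the path, which will not match a one-entry ratio list. The sketch below is a pre-flight check on an already-loaded config dictionary. It assumes that ratio_list defaults to a single entry and that the pretrained weights are read from the pretrained_model prefix plus ".pdparams"; both are my reading of the behaviour described in this thread, not a copy of PaddleOCR's code, and check_train_config is a hypothetical helper name.

```python
import os


def check_train_config(config):
    """Return a list of human-readable problems found in a config dict."""
    problems = []
    for mode in ("Train", "Eval"):
        dataset = config.get(mode, {}).get("dataset", {})
        label_files = dataset.get("label_file_list")
        # A bare string has len() == number of characters, not 1.
        if not isinstance(label_files, list):
            problems.append(
                f"{mode}.dataset.label_file_list should be a list, "
                f"got {type(label_files).__name__}"
            )
            continue
        # Assumption: a missing ratio_list behaves like [1.0].
        ratio_list = dataset.get("ratio_list", [1.0])
        if isinstance(ratio_list, (int, float)):
            ratio_list = [float(ratio_list)] * len(label_files)
        if len(ratio_list) != len(label_files):
            problems.append(
                f"{mode}.dataset: ratio_list has {len(ratio_list)} entries, "
                f"label_file_list has {len(label_files)}"
            )
    # Assumption: weights are loaded from '<pretrained_model>.pdparams'.
    prefix = config.get("Global", {}).get("pretrained_model")
    if prefix and not os.path.isfile(prefix + ".pdparams"):
        problems.append(f"no file found at {prefix}.pdparams")
    return problems
```

With the original config from the question, this check would report the string-valued label_file_list in both Train and Eval, plus the pretrained_model path that points at a directory instead of the best_accuracy prefix.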

2. Fixing RecursionError: maximum recursion depth exceeded while calling a Python object.
Reference: https://blog.csdn.net/qq_41320433/article/details/104299296
Since I have little Python experience, I simply kept the number of images under 100, and the error went away.

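3. ZeroDivisionError: float division by zero.
The error is caused by total_time being 0. The code is in ./tools/program.py: at line 389 I changed total_time = 0.0 to total_time = 1.0. I suspect something goes wrong when the value is updated; since it is only used to measure time, the impact should be small.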

4. These messages in the log file can be ignored:
[2022/05/04 19:24:07] root WARNING: The pretrained params backbone.bb_1_0.short._batch_norm.weight not in model
[2022/05/04 19:24:07] root WARNING: The pretrained params backbone.bb_1_0.short._batch_norm.bias not in model
[2022/05/04 19:24:07] root WARNING: The pretrained params backbone.bb_1_0.short._batch_norm._mean not in model
[2022/05/04 19:24:07] root WARNING: The pretrained params backbone.bb_1_0.short._batch_norm._variance not in model

5. Find a suitable training template.
Download link: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.4/doc/doc_ch/models_list.md

6. I recommend training on a GPU; the CPU is really too slow.
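For the ZeroDivisionError in item 3, an alternative to changing the initial value of total_time is to guard the division itself, so that the reported rate is not skewed by an extra second. This is a minimal sketch of that idea only; frames_per_second is a hypothetical helper, and the actual code at that line of program.py may look different.

```python
def frames_per_second(total_frame, total_time):
    """Return throughput, or 0.0 when no time was measured."""
    # If the evaluation loop never ran, total_time stays at 0.0;
    # report 0.0 instead of dividing by zero.
    if total_time <= 0:
        return 0.0
    return total_frame / total_time
```

A total_time of exactly 0 may also mean that the evaluation loop processed no batches at all, so it is worth checking that the evaluation label file was actually read.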

songjiahao-wq commented 2 years ago

Could you explain how you solved it? I changed the image shape, but the problem persists.

paulhuang815 commented 2 years ago

Could you explain how you solved it? I changed the image shape, but the problem persists.

This works for me!

https://github.com/PaddlePaddle/PaddleOCR/issues/5849#issuecomment-1084164019