Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
KeepKeys:
  keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
loader:
  shuffle: True
  batch_size_per_card: 1
  drop_last: True
  num_workers: 0
When training handwritten text recognition, acc stays at 0 the whole time, with no hint as to why. The pretrained model I am using is ch_ppocr_server_v2.0_rec_pre, provided officially by PaddleOCR. This is my yml config file:
`Global:
  use_gpu: False
  epoch_num: 100
  log_smooth_window: 20
  print_batch_step: 1
  save_model_dir: ./output/rec_chinese_common_v2.0
  save_epoch_step: 3
  # evaluation is run every 5000 iterations after the 4000th iteration
  eval_batch_step: [0, 2000]
  # if pretrained_model is saved in static mode, load_static_weights must set to True
  cal_metric_during_train: True
  pretrained_model: D:\java\PaddleOCR-train\trains\chPPocr\best_accuracy
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img: doc/imgs_words/ch/word_1.jpg
  # for data or label process
  character_dict_path: D:\java\PaddleOCR-train\ppocr\utils\ppocr_keys_v1.txt
  character_type: ch
  max_text_length: 1000
  infer_mode: False
  use_space_char: True
  save_res_path: ./output/rec/predicts_r34_vd_none_bilstm_ctc.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
  regularizer:
    name: 'L2'
    factor: 0.00004

Architecture:
  model_type: rec
  algorithm: CRNN
  Transform:
  Backbone:
    name: ResNet
    layers: 34
  Neck:
    name: SequenceEncoder
    encoder_type: rnn
    hidden_size: 256
  Head:
    name: CTCHead
    fc_decay: 0.00004

Loss:
  name: CTCLoss

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

Train:
  dataset:
    name: SimpleDataSet
    data_dir: D:\java\PaddleOCR-train\trains\recTrain
    label_file_list: D:\java\PaddleOCR-train\trains\recTrain\Label.txt
    transforms:

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: D:\java\PaddleOCR-train\trains\recTrain
    label_file_list: D:\java\PaddleOCR-train\trains\recTrain\Label-test.txt
    transforms:
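It is based on the yml file that corresponds to my pretrained model. After changing some of the values, I started training on a dataset of 15 samples. Here is the printed output: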
[2023/05/20 11:18:50] ppocr INFO: Architecture :
[2023/05/20 11:18:50] ppocr INFO:     Backbone :
[2023/05/20 11:18:50] ppocr INFO:         layers : 34
[2023/05/20 11:18:50] ppocr INFO:         name : ResNet
[2023/05/20 11:18:50] ppocr INFO:     Head :
[2023/05/20 11:18:50] ppocr INFO:         fc_decay : 4e-05
[2023/05/20 11:18:50] ppocr INFO:         name : CTCHead
[2023/05/20 11:18:50] ppocr INFO:     Neck :
[2023/05/20 11:18:50] ppocr INFO:         encoder_type : rnn
[2023/05/20 11:18:50] ppocr INFO:         hidden_size : 256
[2023/05/20 11:18:50] ppocr INFO:         name : SequenceEncoder
[2023/05/20 11:18:50] ppocr INFO:     Transform : None
[2023/05/20 11:18:50] ppocr INFO:     algorithm : CRNN
[2023/05/20 11:18:50] ppocr INFO:     model_type : rec
[2023/05/20 11:18:50] ppocr INFO: Eval :
[2023/05/20 11:18:50] ppocr INFO:     dataset :
[2023/05/20 11:18:50] ppocr INFO:         data_dir : D:\java\PaddleOCR-train\trains\recTrain
[2023/05/20 11:18:50] ppocr INFO:         label_file_list : D:\java\PaddleOCR-train\trains\recTrain\Label-test.txt
[2023/05/20 11:18:50] ppocr INFO:         name : SimpleDataSet
[2023/05/20 11:18:50] ppocr INFO:         transforms :
[2023/05/20 11:18:50] ppocr INFO:             DecodeImage :
[2023/05/20 11:18:50] ppocr INFO:                 channel_first : False
[2023/05/20 11:18:50] ppocr INFO:                 img_mode : BGR
[2023/05/20 11:18:50] ppocr INFO:             CTCLabelEncode : None
[2023/05/20 11:18:50] ppocr INFO:             RecResizeImg :
[2023/05/20 11:18:50] ppocr INFO:                 image_shape : [3, 32, 320]
[2023/05/20 11:18:50] ppocr INFO:             KeepKeys :
[2023/05/20 11:18:50] ppocr INFO:                 keep_keys : ['image', 'label', 'length']
[2023/05/20 11:18:50] ppocr INFO:     loader :
[2023/05/20 11:18:50] ppocr INFO:         batch_size_per_card : 1
[2023/05/20 11:18:50] ppocr INFO:         drop_last : False
[2023/05/20 11:18:50] ppocr INFO:         num_workers : 0
[2023/05/20 11:18:50] ppocr INFO:         shuffle : False
[2023/05/20 11:18:50] ppocr INFO: Global :
[2023/05/20 11:18:50] ppocr INFO:     cal_metric_during_train : True
[2023/05/20 11:18:50] ppocr INFO:     character_dict_path : D:\java\PaddleOCR-train\ppocr\utils\ppocr_keys_v1.txt
[2023/05/20 11:18:50] ppocr INFO:     character_type : ch
[2023/05/20 11:18:50] ppocr INFO:     checkpoints : None
[2023/05/20 11:18:50] ppocr INFO:     distributed : False
[2023/05/20 11:18:50] ppocr INFO:     epoch_num : 100
[2023/05/20 11:18:50] ppocr INFO:     eval_batch_step : [0, 2000]
[2023/05/20 11:18:50] ppocr INFO:     infer_img : doc/imgs_words/ch/word_1.jpg
[2023/05/20 11:18:50] ppocr INFO:     infer_mode : False
[2023/05/20 11:18:50] ppocr INFO:     log_smooth_window : 20
[2023/05/20 11:18:50] ppocr INFO:     max_text_length : 1000
[2023/05/20 11:18:50] ppocr INFO:     pretrained_model : D:\java\PaddleOCR-train\trains\chPPocr\best_accuracy
[2023/05/20 11:18:50] ppocr INFO:     print_batch_step : 1
[2023/05/20 11:18:50] ppocr INFO:     save_epoch_step : 3
[2023/05/20 11:18:50] ppocr INFO:     save_inference_dir : None
[2023/05/20 11:18:50] ppocr INFO:     save_model_dir : ./output/rec_chinese_common_v2.0
[2023/05/20 11:18:50] ppocr INFO:     save_res_path : ./output/rec/predicts_r34_vd_none_bilstm_ctc.txt
[2023/05/20 11:18:50] ppocr INFO:     use_gpu : False
[2023/05/20 11:18:50] ppocr INFO:     use_space_char : True
[2023/05/20 11:18:50] ppocr INFO:     use_visualdl : False
[2023/05/20 11:18:50] ppocr INFO: Loss :
[2023/05/20 11:18:50] ppocr INFO:     name : CTCLoss
[2023/05/20 11:18:50] ppocr INFO: Metric :
[2023/05/20 11:18:50] ppocr INFO:     main_indicator : acc
[2023/05/20 11:18:50] ppocr INFO:     name : RecMetric
[2023/05/20 11:18:50] ppocr INFO: Optimizer :
[2023/05/20 11:18:50] ppocr INFO:     beta1 : 0.9
[2023/05/20 11:18:50] ppocr INFO:     beta2 : 0.999
[2023/05/20 11:18:50] ppocr INFO:     lr :
[2023/05/20 11:18:50] ppocr INFO:         learning_rate : 0.001
[2023/05/20 11:18:50] ppocr INFO:         name : Cosine
[2023/05/20 11:18:50] ppocr INFO:     name : Adam
[2023/05/20 11:18:50] ppocr INFO:     regularizer :
[2023/05/20 11:18:50] ppocr INFO:         factor : 4e-05
[2023/05/20 11:18:50] ppocr INFO:         name : L2
[2023/05/20 11:18:50] ppocr INFO: PostProcess :
[2023/05/20 11:18:50] ppocr INFO:     name : CTCLabelDecode
[2023/05/20 11:18:50] ppocr INFO: Train :
[2023/05/20 11:18:50] ppocr INFO:     dataset :
[2023/05/20 11:18:50] ppocr INFO:         data_dir : D:\java\PaddleOCR-train\trains\recTrain
[2023/05/20 11:18:50] ppocr INFO:         label_file_list : D:\java\PaddleOCR-train\trains\recTrain\Label.txt
[2023/05/20 11:18:50] ppocr INFO:         name : SimpleDataSet
[2023/05/20 11:18:50] ppocr INFO:         transforms :
[2023/05/20 11:18:50] ppocr INFO:             DecodeImage :
[2023/05/20 11:18:50] ppocr INFO:                 channel_first : False
[2023/05/20 11:18:50] ppocr INFO:                 img_mode : BGR
[2023/05/20 11:18:50] ppocr INFO:             RecAug : None
[2023/05/20 11:18:50] ppocr INFO:             CTCLabelEncode : None
[2023/05/20 11:18:50] ppocr INFO:             RecResizeImg :
[2023/05/20 11:18:50] ppocr INFO:                 image_shape : [3, 32, 320]
[2023/05/20 11:18:50] ppocr INFO:             KeepKeys :
[2023/05/20 11:18:50] ppocr INFO:                 keep_keys : ['image', 'label', 'length']
[2023/05/20 11:18:50] ppocr INFO:     loader :
[2023/05/20 11:18:50] ppocr INFO:         batch_size_per_card : 1
[2023/05/20 11:18:50] ppocr INFO:         drop_last : True
[2023/05/20 11:18:50] ppocr INFO:         num_workers : 0
[2023/05/20 11:18:50] ppocr INFO:         shuffle : True
[2023/05/20 11:18:50] ppocr INFO: profiler_options : None
[2023/05/20 11:18:50] ppocr INFO: train with paddle 2.4.2 and device Place(cpu)
[2023/05/20 11:18:50] ppocr INFO: Initialize indexs of datasets:D:\java\PaddleOCR-train\trains\recTrain\Label.txt
[2023/05/20 11:18:50] ppocr INFO: Initialize indexs of datasets:D:\java\PaddleOCR-train\trains\recTrain\Label-test.txt
[2023/05/20 11:18:52] ppocr INFO: train dataloader has 15 iters
[2023/05/20 11:18:52] ppocr INFO: valid dataloader has 15 iters
[2023/05/20 11:18:53] ppocr INFO: load pretrain successful from D:\java\PaddleOCR-train\trains\chPPocr\best_accuracy
[2023/05/20 11:18:53] ppocr INFO: During the training process, after the 0th iteration, an evaluation is run every 2000 iterations
[2023/05/20 11:19:02] ppocr INFO: epoch: [1/100], global_step: 1, lr: 0.001000, acc: 0.000000, norm_edit_dis: 0.021515, loss: 0.000000, avg_reader_cost: 0.04919 s, avg_batch_cost: 9.37230 s, avg_samples: 1.0, ips: 0.10670 samples/s, eta: 3:54:09
[2023/05/20 11:19:20] ppocr INFO: epoch: [1/100], global_step: 2, lr: 0.001000, acc: 0.000000, norm_edit_dis: 0.022527, loss: 0.000000, avg_reader_cost: 0.00000 s, avg_batch_cost: 18.02700 s, avg_samples: 1.0, ips: 0.05547 samples/s, eta: 5:42:02
[2023/05/20 11:19:29] ppocr INFO: epoch: [1/100], global_step: 3, lr: 0.001000, acc: 0.000000, norm_edit_dis: 0.021515, loss: 0.000000, avg_reader_cost: 0.00000 s, avg_batch_cost: 8.96189 s, avg_samples: 1.0, ips: 0.11158 samples/s, eta: 5:02:24
[2023/05/20 11:19:40] ppocr INFO: epoch: [1/100], global_step: 4, lr: 0.001000, acc: 0.000000, norm_edit_dis: 0.022527, loss: 0.000000, avg_reader_cost: 0.00000 s, avg_batch_cost: 10.60504 s, avg_samples: 1.0, ips: 0.09429 samples/s, eta: 4:52:45
[2023/05/20 11:19:50] ppocr INFO: epoch: [1/100], global_step: 5, lr: 0.001000, acc: 0.000000, norm_edit_dis: 0.023539, loss: 0.000000, avg_reader_cost: 0.00000 s, avg_batch_cost: 10.06867 s, avg_samples: 1.0, ips: 0.09932 samples/s, eta: 4:44:13
[2023/05/20 11:20:08] ppocr INFO: epoch: [1/100], global_step: 6, lr: 0.001000, acc: 0.000000, norm_edit_dis: 0.023539, loss: 0.000000, avg_reader_cost: 0.00000 s, avg_batch_cost: 17.39051 s, avg_samples: 1.0, ips: 0.05750 samples/s, eta: 5:08:51
[2023/05/20 11:20:27] ppocr INFO: epoch: [1/100], global_step: 7, lr: 0.001000, acc: 0.000000, norm_edit_dis: 0.023539, loss: 0.000000, avg_reader_cost: 0.00000 s, avg_batch_cost: 19.16978 s, avg_samples: 1.0, ips: 0.05217 samples/s, eta: 5:32:42
[2023/05/20 11:20:45] ppocr INFO: epoch: [1/100], global_step: 8, lr: 0.001000, acc: 0.000000, norm_edit_dis: 0.023539, loss: 0.000000, avg_reader_cost: 0.00098 s, avg_batch_cost: 18.30316 s, avg_samples: 1.0, ips: 0.05464 samples/s, eta: 5:47:49
[2023/05/20 11:20:55] ppocr INFO: epoch: [1/100], global_step: 9, lr: 0.001000, acc: 0.000000, norm_edit_dis: 0.023539, loss: 0.000000, avg_reader_cost: 0.00200 s, avg_batch_cost: 9.80303 s, avg_samples: 1.0, ips: 0.10201 samples/s, eta: 5:36:01
[2023/05/20 11:21:05] ppocr INFO: epoch: [1/100], global_step: 10, lr: 0.001000, acc: 0.000000, norm_edit_dis: 0.023269, loss: 0.000000, avg_reader_cost: 0.00000 s, avg_batch_cost: 10.09848 s, avg_samples: 1.0, ips: 0.09902 samples/s, eta: 5:27:18
[2023/05/20 11:21:15] ppocr INFO: epoch: [1/100], global_step: 11, lr: 0.001000, acc: 0.000000, norm_edit_dis: 0.023539, loss: 0.000000, avg_reader_cost: 0.00000 s, avg_batch_cost: 9.84281 s, avg_samples: 1.0, ips: 0.10160 samples/s, eta: 5:19:33
[2023/05/20 11:21:26] ppocr INFO: epoch: [1/100], global_step: 12, lr: 0.001000, acc: 0.000000, norm_edit_dis: 0.023269, loss: 0.000000, avg_reader_cost: 0.00000 s, avg_batch_cost: 11.58496 s, avg_samples: 1.0, ips: 0.08632 samples/s, eta: 5:16:40
[2023/05/20 11:21:44] ppocr INFO: epoch: [1/100], global_step: 13, lr: 0.001000, acc: 0.000000, norm_edit_dis: 0.023539, loss: 0.000000, avg_reader_cost: 0.00000 s, avg_batch_cost: 17.59535 s, avg_samples: 1.0, ips: 0.05683 samples/s, eta: 5:25:39
[2023/05/20 11:22:02] ppocr INFO: epoch: [1/100], global_step: 14, lr: 0.001000, acc: 0.000000, norm_edit_dis: 0.023269, loss: 0.000000, avg_reader_cost: 0.00097 s, avg_batch_cost: 18.52011 s, avg_samples: 1.0, ips: 0.05400 samples/s, eta: 5:34:57`
I tried waiting, but even after it reached epoch 12/100 the acc was still 0.000000. This is my label file, generated with the annotation tool that ships with PaddleOCR:
imgs/lgl.jpg [{"transcription": "刘光烈", "points": [[12, 9], [154, 9], [154, 66], [12, 66]], "difficult": false}]
imgs/lgw.jpg [{"transcription": "罗国伟", "points": [[8, 0], [168, 5], [170, 75], [11, 76]], "difficult": false}]
imgs/pys.jpg [{"transcription": "番禺所", "points": [[7, 9], [111, 15], [109, 56], [5, 50]], "difficult": false}]
imgs/xmjl.jpg [{"transcription": "项目经理", "points": [[2, 5], [131, 2], [139, 44], [2, 49]], "difficult": false}]
imgs/ywjswgn.jpg [{"transcription": "遗忘就是我给你", "points": [[9, 20], [401, 24], [400, 89], [1, 87]], "difficult": false}]
imgs/zhdjn.jpg [{"transcription": "最好的纪念", "points": [[21, 20], [283, 7], [287, 93], [24, 105]], "difficult": false}]
imgs/ztdz.jpg [{"transcription": "众通电子", "points": [[6, 7], [151, 7], [151, 52], [6, 52]], "difficult": false}]
imgs/zy.jpg [{"transcription": "张瑜", "points": [[7, 0], [110, 0], [110, 57], [7, 57]], "difficult": false}]
imgs/zzsj.jpg [{"transcription": "总支书记", "points": [[10, 7], [204, 7], [204, 70], [10, 70]], "difficult": false}]
imgs/gzs.png [{"transcription": "广州所", "points": [[2, 6], [132, 6], [132, 59], [2, 59]], "difficult": false}]
imgs/gzzx.jpg [{"transcription": "广州中心", "points": [[15, 9], [182, 9], [182, 65], [15, 65]], "difficult": false}]
imgs/gzzxx.jpg [{"transcription": "广州中心", "points": [[11, 8], [197, 8], [197, 73], [11, 73]], "difficult": false}]
imgs/hdrbxyzj.jpg [{"transcription": "很多人不需要再见", "points": [[17, 3], [401, 3], [401, 68], [17, 68]], "difficult": false}]
imgs/sj.jpg [{"transcription": "书记", "points": [[25, 10], [94, 10], [94, 42], [25, 42]], "difficult": false}]
imgs/ywzslgey.jpg [{"transcription": "因为只是路过而已", "points": [[20, 14], [401, 14], [401, 76], [20, 76]], "difficult": false}]
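For reference, each label line above pairs an image path with a JSON list of boxes, each carrying a `transcription` and four `points`. PaddleOCR's recognition documentation describes a simpler `path<TAB>text` layout for recognition label files, so it may be worth comparing the two. The sketch below is only an illustration of how lines in the layout above could be parsed and rewritten into that form. It assumes the image path contains no whitespace and that every image has exactly one box; `parse_label_line` and `to_rec_lines` are hypothetical helper names, not part of PaddleOCR, and whether the label layout is related to the zero accuracy is not established here.

```python
import json


def parse_label_line(line):
    """Split one label line into (image_path, list of box dicts)."""
    # Assumption: the path has no whitespace, so the first run of
    # whitespace (tab or space) separates it from the JSON payload.
    path, payload = line.strip().split(None, 1)
    return path, json.loads(payload)


def to_rec_lines(lines):
    """Yield 'path<TAB>text' for every line that holds exactly one box."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        path, boxes = parse_label_line(line)
        if len(boxes) == 1:
            yield f"{path}\t{boxes[0]['transcription']}"


# Two lines copied from the label file above.
sample = [
    'imgs/zy.jpg [{"transcription": "张瑜", "points": [[7, 0], [110, 0], [110, 57], [7, 57]], "difficult": false}]',
    'imgs/sj.jpg [{"transcription": "书记", "points": [[25, 10], [94, 10], [94, 42], [25, 42]], "difficult": false}]',
]

rec_lines = list(to_rec_lines(sample))
for rec_line in rec_lines:
    print(rec_line)
```

Images with more than one box are skipped on purpose, because a whole-image recognition label would be ambiguous in that case.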
The test file uses the same format. I don't understand what is causing this. I also tried ICDAR2015, but acc was 0.000000 there as well. Has anyone else run into this?
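Besides acc, the training log above also reports `loss: 0.000000` at every step. A minimal sketch for pulling these fields out of a saved log, so the values can be checked across a longer run, might look like the following. The regular expression is written against the step-line layout shown in this post only; `STEP_RE` and `parse_steps` are hypothetical names, and other PaddleOCR versions may print the fields differently.

```python
import re

# Matches the metric fields of one training step line as printed above.
STEP_RE = re.compile(
    r"global_step: (\d+), lr: ([\d.]+), acc: ([\d.]+), "
    r"norm_edit_dis: ([\d.]+), loss: ([\d.]+)"
)


def parse_steps(log_text):
    """Return one dict per training step found in the log text."""
    rows = []
    for match in STEP_RE.finditer(log_text):
        step, lr, acc, ned, loss = match.groups()
        rows.append({
            "step": int(step),
            "lr": float(lr),
            "acc": float(acc),
            "norm_edit_dis": float(ned),
            "loss": float(loss),
        })
    return rows


# Two step lines copied from the log above (trailing fields omitted).
log_text = (
    "[2023/05/20 11:19:02] ppocr INFO: epoch: [1/100], global_step: 1, "
    "lr: 0.001000, acc: 0.000000, norm_edit_dis: 0.021515, loss: 0.000000, "
    "avg_reader_cost: 0.04919 s\n"
    "[2023/05/20 11:19:20] ppocr INFO: epoch: [1/100], global_step: 2, "
    "lr: 0.001000, acc: 0.000000, norm_edit_dis: 0.022527, loss: 0.000000, "
    "avg_reader_cost: 0.00000 s\n"
)

rows = parse_steps(log_text)
all_zero_loss = all(row["loss"] == 0.0 for row in rows)
print(len(rows), "steps parsed; loss is zero at every step:", all_zero_loss)
```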