ANDROIDTODO commented 2 years ago

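Hi PaddleOCR team, when I train following the text recognition example, the final acc is 0 and the evaluation result is also 0. I followed the example step by step and only changed eval_batch_step to 500.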

Training log:

PS D:\paddle\PaddleOCR> python tools/train.py -c configs/rec/rec_icdar15_train.yml
[2022/03/08 15:06:30] root INFO: Architecture :
[2022/03/08 15:06:30] root INFO:     Backbone :
[2022/03/08 15:06:30] root INFO:         model_name : large
[2022/03/08 15:06:30] root INFO:         name : MobileNetV3
[2022/03/08 15:06:30] root INFO:         scale : 0.5
[2022/03/08 15:06:30] root INFO:     Head :
[2022/03/08 15:06:30] root INFO:         fc_decay : 0
[2022/03/08 15:06:30] root INFO:         name : CTCHead
[2022/03/08 15:06:30] root INFO:     Neck :
[2022/03/08 15:06:30] root INFO:         encoder_type : rnn
[2022/03/08 15:06:30] root INFO:         hidden_size : 96
[2022/03/08 15:06:30] root INFO:         name : SequenceEncoder
[2022/03/08 15:06:30] root INFO:     Transform : None
[2022/03/08 15:06:30] root INFO:     algorithm : CRNN
[2022/03/08 15:06:30] root INFO:     model_type : rec
[2022/03/08 15:06:30] root INFO: Eval :
[2022/03/08 15:06:30] root INFO:     dataset :
[2022/03/08 15:06:30] root INFO:         data_dir : ./train_data/ic15_data
[2022/03/08 15:06:30] root INFO:         label_file_list : ['./train_data/ic15_data/rec_gt_test.txt']
[2022/03/08 15:06:30] root INFO:         name : SimpleDataSet
[2022/03/08 15:06:30] root INFO:         transforms :
[2022/03/08 15:06:30] root INFO:             DecodeImage :
[2022/03/08 15:06:30] root INFO:                 channel_first : False
[2022/03/08 15:06:30] root INFO:                 img_mode : BGR
[2022/03/08 15:06:30] root INFO:             CTCLabelEncode : None
[2022/03/08 15:06:30] root INFO:             RecResizeImg :
[2022/03/08 15:06:30] root INFO:                 image_shape : [3, 32, 100]
[2022/03/08 15:06:30] root INFO:             KeepKeys :
[2022/03/08 15:06:30] root INFO:                 keep_keys : ['image', 'label', 'length']
[2022/03/08 15:06:30] root INFO:     loader :
[2022/03/08 15:06:30] root INFO:         batch_size_per_card : 256
[2022/03/08 15:06:30] root INFO:         drop_last : False
[2022/03/08 15:06:30] root INFO:         num_workers : 4
[2022/03/08 15:06:30] root INFO:         shuffle : False
[2022/03/08 15:06:30] root INFO:         use_shared_memory : False
[2022/03/08 15:06:30] root INFO: Global :
[2022/03/08 15:06:30] root INFO:     cal_metric_during_train : True
[2022/03/08 15:06:30] root INFO:     character_dict_path : ppocr/utils/en_dict.txt
[2022/03/08 15:06:30] root INFO:     checkpoints : None
[2022/03/08 15:06:30] root INFO:     debug : False
[2022/03/08 15:06:30] root INFO:     distributed : False
[2022/03/08 15:06:30] root INFO:     epoch_num : 72
[2022/03/08 15:06:30] root INFO:     eval_batch_step : [0, 300]
[2022/03/08 15:06:30] root INFO:     infer_img : doc/imgs_words_en/word_10.png
[2022/03/08 15:06:30] root INFO:     infer_mode : False
[2022/03/08 15:06:30] root INFO:     log_smooth_window : 20
[2022/03/08 15:06:30] root INFO:     max_text_length : 25
[2022/03/08 15:06:30] root INFO:     pretrained_model : None
[2022/03/08 15:06:30] root INFO:     print_batch_step : 10
[2022/03/08 15:06:30] root INFO:     save_epoch_step : 3
[2022/03/08 15:06:30] root INFO:     save_inference_dir : ./
[2022/03/08 15:06:30] root INFO:     save_model_dir : ./output/rec/ic15/
[2022/03/08 15:06:30] root INFO:     save_res_path : ./output/rec/predicts_ic15.txt
[2022/03/08 15:06:30] root INFO:     use_gpu : True
[2022/03/08 15:06:30] root INFO:     use_space_char : False
[2022/03/08 15:06:30] root INFO:     use_visualdl : False
[2022/03/08 15:06:30] root INFO: Loss :
[2022/03/08 15:06:30] root INFO:     name : CTCLoss
[2022/03/08 15:06:30] root INFO: Metric :
[2022/03/08 15:06:30] root INFO:     main_indicator : acc
[2022/03/08 15:06:30] root INFO:     name : RecMetric
[2022/03/08 15:06:30] root INFO: Optimizer :
[2022/03/08 15:06:30] root INFO:     beta1 : 0.9
[2022/03/08 15:06:30] root INFO:     beta2 : 0.999
[2022/03/08 15:06:30] root INFO:     lr :
[2022/03/08 15:06:30] root INFO:         learning_rate : 0.0005
[2022/03/08 15:06:30] root INFO:     name : Adam
[2022/03/08 15:06:30] root INFO:     regularizer :
[2022/03/08 15:06:30] root INFO:         factor : 0
[2022/03/08 15:06:30] root INFO:         name : L2
[2022/03/08 15:06:30] root INFO: PostProcess :
[2022/03/08 15:06:30] root INFO:     name : CTCLabelDecode
[2022/03/08 15:06:30] root INFO: Train :
[2022/03/08 15:06:30] root INFO:     dataset :
[2022/03/08 15:06:30] root INFO:         data_dir : ./train_data/ic15_data/
[2022/03/08 15:06:30] root INFO:         label_file_list : ['./train_data/ic15_data/rec_gt_train.txt']
[2022/03/08 15:06:30] root INFO:         name : SimpleDataSet
[2022/03/08 15:06:30] root INFO:         transforms :
[2022/03/08 15:06:30] root INFO:             DecodeImage :
[2022/03/08 15:06:30] root INFO:                 channel_first : False
[2022/03/08 15:06:30] root INFO:                 img_mode : BGR
[2022/03/08 15:06:30] root INFO:             CTCLabelEncode : None
[2022/03/08 15:06:30] root INFO:             RecResizeImg :
[2022/03/08 15:06:30] root INFO:                 image_shape : [3, 32, 100]
[2022/03/08 15:06:30] root INFO:             KeepKeys :
[2022/03/08 15:06:30] root INFO:                 keep_keys : ['image', 'label', 'length']
[2022/03/08 15:06:30] root INFO:     loader :
[2022/03/08 15:06:30] root INFO:         batch_size_per_card : 256
[2022/03/08 15:06:30] root INFO:         drop_last : True
[2022/03/08 15:06:30] root INFO:         num_workers : 8
[2022/03/08 15:06:30] root INFO:         shuffle : True
[2022/03/08 15:06:30] root INFO:         use_shared_memory : False
[2022/03/08 15:06:30] root INFO: profiler_options : None
[2022/03/08 15:06:30] root INFO: train with paddle 2.2.0 and device CUDAPlace(0)
[2022/03/08 15:06:30] root INFO: Initialize indexs of datasets:['./train_data/ic15_data/rec_gt_train.txt']
[2022/03/08 15:06:30] root INFO: Initialize indexs of datasets:['./train_data/ic15_data/rec_gt_test.txt']
W0308 15:06:30.286813 10044 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 11.3, Runtime API Version: 11.2
W0308 15:06:30.296787 10044 device_context.cc:465] device: 0, cuDNN Version: 8.2.
INFO:root:If regularizer of a Parameter has been set by 'paddle.ParamAttr' or 'static.WeightNormParamAttr' already. The weight_decay[L2Decay, regularization_coeff=0.000000] in Optimizer will not take effect, and it will only be applied to other Parameters!
[2022/03/08 15:06:32] root INFO: train from scratch
[2022/03/08 15:06:32] root INFO: train dataloader has 17 iters
[2022/03/08 15:06:32] root INFO: valid dataloader has 9 iters
[2022/03/08 15:06:32] root INFO: During the training process, after the 0th iteration, an evaluation is run every 300 iterations
[2022/03/08 15:06:32] root INFO: Initialize indexs of datasets:['./train_data/ic15_data/rec_gt_train.txt']
[2022/03/08 15:06:35] root INFO: epoch: [1/72], iter: 10, lr: 0.000500, loss: 78.730927, acc: 0.000000, norm_edit_dis: 0.000004, reader_cost: 0.00967 s, batch_cost: 0.26955 s, samples: 2816, ips: 1044.70676
[2022/03/08 15:06:36] root INFO: save model in ./output/rec/ic15/latest
... ... ...
batch_cost: 0.09192 s, samples: 1280, ips: 1392.46692
[2022/03/08 15:10:32] root INFO: epoch: [72/72], iter: 1150, lr: 0.000500, loss: 16.212624, acc: 0.000000, norm_edit_dis: 0.171520, reader_cost: 0.00089 s, batch_cost: 0.16428 s, samples: 2560, ips: 1558.28445
[2022/03/08 15:10:33] root INFO: save model in ./output/rec/ic15/latest
[2022/03/08 15:10:33] root INFO: save model in ./output/rec/ic15/iter_epoch_72
[2022/03/08 15:10:33] root INFO: best metric, acc: 0.0, norm_edit_dis: 0.10148736461565377, fps: 6356.3933238862355, best_epoch: 57
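A note on reading the last log line: it reports acc: 0.0 together with a norm_edit_dis of about 0.10. The two can differ because acc counts only whole-string exact matches, while norm_edit_dis gives partial credit for characters that are right. The sketch below is a simplified stand-in written for illustration, not PaddleOCR's RecMetric code; the function names, the normalization by the longer string, and the sample predictions are all assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def rec_metrics(pairs):
    """pairs: list of (prediction, target). Returns (acc, norm_edit_dis)."""
    correct = 0
    norm_dist_sum = 0.0
    for pred, target in pairs:
        # Assumption: normalize by the longer of the two strings.
        longest = max(len(pred), len(target), 1)
        norm_dist_sum += edit_distance(pred, target) / longest
        correct += int(pred == target)
    n = max(len(pairs), 1)
    return correct / n, 1.0 - norm_dist_sum / n


# Hypothetical predictions: some characters right, no word fully right.
samples = [("hcllo", "hello"), ("", "world"), ("pa", "paddle")]
acc, ned = rec_metrics(samples)
print(acc, round(ned, 3))
```

With these made-up samples no prediction matches its target exactly, so acc is 0 even though norm_edit_dis stays above 0, which is the same pattern the log shows.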

Here is the configuration:

Global:
  use_gpu: true
  epoch_num: 72
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec/ic15/
  save_epoch_step: 3
  # evaluation is run every 2000 iterations
  eval_batch_step: [0, 500]
  cal_metric_during_train: True
  pretrained_model:
  checkpoints:
  save_inference_dir: ./
  use_visualdl: False
  infer_img: doc/imgs_words_en/word_10.png
  # for data or label process
  character_dict_path: ppocr/utils/en_dict.txt
  max_text_length: 25
  infer_mode: False
  use_space_char: False
  save_res_path: ./output/rec/predicts_ic15.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    learning_rate: 0.0005
  regularizer:
    name: 'L2'
    factor: 0

Architecture:
  model_type: rec
  algorithm: CRNN
  Transform:
  Backbone:
    name: MobileNetV3
    scale: 0.5
    model_name: large
  Neck:
    name: SequenceEncoder
    encoder_type: rnn
    hidden_size: 96
  Head:
    name: CTCHead
    fc_decay: 0

Loss:
  name: CTCLoss

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/ic15_data/
    label_file_list: ["./train_data/ic15_data/rec_gt_train.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 100]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: True
    batch_size_per_card: 256
    drop_last: True
    num_workers: 8
    use_shared_memory: False

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/ic15_data
    label_file_list: ["./train_data/ic15_data/rec_gt_test.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 100]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 256
    num_workers: 4
    use_shared_memory: False
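To put the size of this run in perspective, here is a rough back-of-the-envelope sketch. It uses epoch_num: 72 and batch_size_per_card: 256 from the config. The 16 steps per epoch is an assumption inferred from the log (the dataloader reports 17 iters, but the last logged step is iter 1150 in epoch 72, which fits 16 steps per epoch better). The rule that evaluation fires every eval_every steps after eval_start is also a simplifying assumption, and training_schedule is a hypothetical helper, not a PaddleOCR function. Note that the log reports evaluation every 300 iterations while the pasted config says [0, 500], so both are computed.

```python
def training_schedule(iters_per_epoch, epochs, batch_size, eval_start, eval_every):
    """Return (total iterations, images seen per epoch, [(eval_iter, epoch), ...])."""
    total_iters = iters_per_epoch * epochs
    images_per_epoch = iters_per_epoch * batch_size
    evals = []
    for it in range(eval_start + eval_every, total_iters + 1, eval_every):
        # Ceiling division: the 1-based epoch that contains this iteration.
        epoch = (it + iters_per_epoch - 1) // iters_per_epoch
        evals.append((it, epoch))
    return total_iters, images_per_epoch, evals


# 16 steps per epoch is inferred from the log, not stated in it.
total, per_epoch, evals_300 = training_schedule(16, 72, 256, 0, 300)
_, _, evals_500 = training_schedule(16, 72, 256, 0, 500)

print("total iterations:", total)
print("images per epoch:", per_epoch)
print("evals with step 300:", evals_300)
print("evals with step 500:", evals_500)
```

Under these assumptions the whole run is only about 1,150 optimizer steps over roughly 4,000 images per epoch, starting from random weights, and the evaluation at iteration 900 falls in epoch 57, which is consistent with the best_epoch: 57 reported in the log.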

dengmingD commented 2 years ago

[2022/03/08 15:06:30] root INFO: pretrained_model : None

The official recommendation is to use a pretrained model.
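For readers wondering how to act on this: one way is to pass the pretrained weights through the -o override of tools/train.py instead of editing the YAML file. The sketch below only assembles the command line. build_train_command is a hypothetical helper, and the model path is a placeholder for wherever the downloaded weights were unpacked, not a real file name.

```python
import shlex


def build_train_command(config_path, pretrained_model=None):
    """Assemble the argument list for tools/train.py (hypothetical helper)."""
    cmd = ["python", "tools/train.py", "-c", config_path]
    if pretrained_model:
        # -o overrides a single config key from the command line.
        cmd += ["-o", f"Global.pretrained_model={pretrained_model}"]
    return cmd


# Placeholder path: replace with the location of the downloaded weights.
PRETRAINED = "./pretrain_models/my_model/best_accuracy"

cmd = build_train_command("configs/rec/rec_icdar15_train.yml", PRETRAINED)
print(shlex.join(cmd))
```

Setting pretrained_model: under Global in the YAML file should have the same effect; the override is just convenient when switching between runs.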

ANDROIDTODO commented 2 years ago

[2022/03/08 15:06:30] root INFO: pretrained_model : None

The official recommendation is to use a pretrained model.

Thank you very much for your reply. It solved my problem.

qqqqq127 commented 2 years ago

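[2022/03/08 15:06:30] root INFO: pretrained_model : None

The official recommendation is to use a pretrained model.

CRNN has a pretrained model, but where can I download pretrained models for the other algorithms?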

dengmingD commented 2 years ago

https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/algorithm_overview.md#2%E6%96%87%E6%9C%AC%E8%AF%86%E5%88%AB%E7%AE%97%E6%B3%95

paddle-bot-old[bot] commented 2 years ago

Since you haven't replied for more than 3 months, we have closed this issue/pr. If the problem is not solved or there is a follow-up one, please reopen it at any time and we will continue to follow up. It is recommended to pull and try the latest code first.

PaddlePaddle / PaddleOCR

acc=0 after training with the text recognition training example #5659
