PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0

Custom dictionary reports: root WARNING: The shape of model params Student.head.fc2.bias [21333] not matched with loaded params Student.head.fc2.bias [6625] ! #5655

Closed: dengmingD closed this issue 2 years ago

dengmingD commented 2 years ago

The new dictionary has 21333 characters and the old dictionary has 6625. Training reports these warnings:

[2022/03/08 08:34:10] root WARNING: The shape of model params Teacher.head.fc2.weight [96, 21333] not matched with loaded params Teacher.head.fc2.weight [96, 6625] !
[2022/03/08 08:34:10] root WARNING: The shape of model params Teacher.head.fc2.bias [21333] not matched with loaded params Teacher.head.fc2.bias [6625] !
[2022/03/08 08:34:10] root WARNING: The shape of model params Student.head.fc2.weight [96, 21333] not matched with loaded params Student.head.fc2.weight [96, 6625] !
[2022/03/08 08:34:10] root WARNING: The shape of model params Student.head.fc2.bias [21333] not matched with loaded params Student.head.fc2.bias [6625] !

It is only a warning and training does start, but the loss later becomes NaN. The full output is below:

[2022/03/08 08:33:47] root INFO: Architecture :
[2022/03/08 08:33:47] root INFO: Models :
[2022/03/08 08:33:47] root INFO: Student :
[2022/03/08 08:33:47] root INFO: Backbone :
[2022/03/08 08:33:47] root INFO: name : MobileNetV1Enhance
[2022/03/08 08:33:47] root INFO: scale : 0.5
[2022/03/08 08:33:47] root INFO: Head :
[2022/03/08 08:33:47] root INFO: fc_decay : 2e-05
[2022/03/08 08:33:47] root INFO: mid_channels : 96
[2022/03/08 08:33:47] root INFO: name : CTCHead
[2022/03/08 08:33:47] root INFO: Neck :
[2022/03/08 08:33:47] root INFO: encoder_type : rnn
[2022/03/08 08:33:47] root INFO: hidden_size : 64
[2022/03/08 08:33:47] root INFO: name : SequenceEncoder
[2022/03/08 08:33:47] root INFO: Transform : None
[2022/03/08 08:33:47] root INFO: algorithm : CRNN
[2022/03/08 08:33:47] root INFO: freeze_params : False
[2022/03/08 08:33:47] root INFO: model_type : rec
[2022/03/08 08:33:47] root INFO: pretrained : None
[2022/03/08 08:33:47] root INFO: return_all_feats : True
[2022/03/08 08:33:47] root INFO: Teacher :
[2022/03/08 08:33:47] root INFO: Backbone :
[2022/03/08 08:33:47] root INFO: name : MobileNetV1Enhance
[2022/03/08 08:33:47] root INFO: scale : 0.5
[2022/03/08 08:33:47] root INFO: Head :
[2022/03/08 08:33:47] root INFO: fc_decay : 2e-05
[2022/03/08 08:33:47] root INFO: mid_channels : 96
[2022/03/08 08:33:47] root INFO: name : CTCHead
[2022/03/08 08:33:47] root INFO: Neck :
[2022/03/08 08:33:47] root INFO: encoder_type : rnn
[2022/03/08 08:33:47] root INFO: hidden_size : 64
[2022/03/08 08:33:47] root INFO: name : SequenceEncoder
[2022/03/08 08:33:47] root INFO: Transform : None
[2022/03/08 08:33:47] root INFO: algorithm : CRNN
[2022/03/08 08:33:47] root INFO: freeze_params : False
[2022/03/08 08:33:47] root INFO: model_type : rec
[2022/03/08 08:33:47] root INFO: pretrained : None
[2022/03/08 08:33:47] root INFO: return_all_feats : True
[2022/03/08 08:33:47] root INFO: algorithm : Distillation
[2022/03/08 08:33:47] root INFO: model_type : rec
[2022/03/08 08:33:47] root INFO: name : DistillationModel
[2022/03/08 08:33:47] root INFO: Eval :
[2022/03/08 08:33:47] root INFO: dataset :
[2022/03/08 08:33:47] root INFO: data_dir : d:/ocr/test_data
[2022/03/08 08:33:47] root INFO: label_file_list : ['d:/ocr/test_lable.txt']
[2022/03/08 08:33:47] root INFO: name : SimpleDataSet
[2022/03/08 08:33:47] root INFO: transforms :
[2022/03/08 08:33:47] root INFO: DecodeImage :
[2022/03/08 08:33:47] root INFO: channel_first : False
[2022/03/08 08:33:47] root INFO: img_mode : BGR
[2022/03/08 08:33:47] root INFO: CTCLabelEncode : None
[2022/03/08 08:33:47] root INFO: RecResizeImg :
[2022/03/08 08:33:47] root INFO: image_shape : [3, 32, 320]
[2022/03/08 08:33:47] root INFO: KeepKeys :
[2022/03/08 08:33:47] root INFO: keep_keys : ['image', 'label', 'length']
[2022/03/08 08:33:47] root INFO: loader :
[2022/03/08 08:33:47] root INFO: batch_size_per_card : 64
[2022/03/08 08:33:47] root INFO: drop_last : False
[2022/03/08 08:33:47] root INFO: num_workers : 1
[2022/03/08 08:33:47] root INFO: shuffle : False
[2022/03/08 08:33:47] root INFO: Global :
[2022/03/08 08:33:47] root INFO: cal_metric_during_train : True
[2022/03/08 08:33:47] root INFO: character_dict_path : ppocr/utils/ppocr_keys_v2.txt
[2022/03/08 08:33:47] root INFO: character_type : ch
[2022/03/08 08:33:47] root INFO: checkpoints : None
[2022/03/08 08:33:47] root INFO: debug : False
[2022/03/08 08:33:47] root INFO: distributed : False
[2022/03/08 08:33:47] root INFO: epoch_num : 500
[2022/03/08 08:33:47] root INFO: eval_batch_step : [0, 200000]
[2022/03/08 08:33:47] root INFO: infer_img : doc/imgs_words/ch/word_1.jpg
[2022/03/08 08:33:47] root INFO: infer_mode : False
[2022/03/08 08:33:47] root INFO: log_smooth_window : 100
[2022/03/08 08:33:47] root INFO: max_text_length : 25
[2022/03/08 08:33:47] root INFO: pretrained_model : ./train_model/ch_PP-OCRv2_rec_train/best_accuracy
[2022/03/08 08:33:47] root INFO: print_batch_step : 100
[2022/03/08 08:33:47] root INFO: save_epoch_step : 1
[2022/03/08 08:33:47] root INFO: save_inference_dir : None
[2022/03/08 08:33:47] root INFO: save_model_dir : ./output/rec_pp-OCRv2_distillation_1_5_2k
[2022/03/08 08:33:47] root INFO: save_res_path : ./output/rec/predicts_pp-OCRv2_distillation.txt
[2022/03/08 08:33:47] root INFO: use_gpu : True
[2022/03/08 08:33:47] root INFO: use_space_char : False
[2022/03/08 08:33:47] root INFO: use_visualdl : False
[2022/03/08 08:33:47] root INFO: Loss :
[2022/03/08 08:33:47] root INFO: loss_config_list :
[2022/03/08 08:33:47] root INFO: DistillationCTCLoss :
[2022/03/08 08:33:47] root INFO: key : head_out
[2022/03/08 08:33:47] root INFO: model_name_list : ['Student', 'Teacher']
[2022/03/08 08:33:47] root INFO: weight : 1.0
[2022/03/08 08:33:47] root INFO: DistillationDMLLoss :
[2022/03/08 08:33:47] root INFO: act : softmax
[2022/03/08 08:33:47] root INFO: key : head_out
[2022/03/08 08:33:47] root INFO: model_name_pairs : [['Student', 'Teacher']]
[2022/03/08 08:33:47] root INFO: use_log : True
[2022/03/08 08:33:47] root INFO: weight : 1.0
[2022/03/08 08:33:47] root INFO: DistillationDistanceLoss :
[2022/03/08 08:33:47] root INFO: key : backbone_out
[2022/03/08 08:33:47] root INFO: mode : l2
[2022/03/08 08:33:47] root INFO: model_name_pairs : [['Student', 'Teacher']]
[2022/03/08 08:33:47] root INFO: weight : 1.0
[2022/03/08 08:33:47] root INFO: name : CombinedLoss
[2022/03/08 08:33:47] root INFO: Metric :
[2022/03/08 08:33:47] root INFO: base_metric_name : RecMetric
[2022/03/08 08:33:47] root INFO: key : Student
[2022/03/08 08:33:47] root INFO: main_indicator : acc
[2022/03/08 08:33:47] root INFO: name : DistillationMetric
[2022/03/08 08:33:47] root INFO: Optimizer :
[2022/03/08 08:33:47] root INFO: beta1 : 0.9
[2022/03/08 08:33:47] root INFO: beta2 : 0.999
[2022/03/08 08:33:47] root INFO: lr :
[2022/03/08 08:33:47] root INFO: decay_epochs : [700, 800]
[2022/03/08 08:33:47] root INFO: name : Piecewise
[2022/03/08 08:33:47] root INFO: values : [0.0001, 1e-05]
[2022/03/08 08:33:47] root INFO: warmup_epoch : 5
[2022/03/08 08:33:47] root INFO: name : Adam
[2022/03/08 08:33:47] root INFO: regularizer :
[2022/03/08 08:33:47] root INFO: factor : 2e-06
[2022/03/08 08:33:47] root INFO: name : L2
[2022/03/08 08:33:47] root INFO: PostProcess :
[2022/03/08 08:33:47] root INFO: key : head_out
[2022/03/08 08:33:47] root INFO: model_name : ['Student', 'Teacher']
[2022/03/08 08:33:47] root INFO: name : DistillationCTCLabelDecode
[2022/03/08 08:33:47] root INFO: Train :
[2022/03/08 08:33:47] root INFO: dataset :
[2022/03/08 08:33:47] root INFO: data_dir : d:/ocr/train_data_1_5_2k
[2022/03/08 08:33:47] root INFO: label_file_list : ['d:/ocr/train_data_1_5_2k/lable.txt']
[2022/03/08 08:33:47] root INFO: name : SimpleDataSet
[2022/03/08 08:33:47] root INFO: transforms :
[2022/03/08 08:33:47] root INFO: DecodeImage :
[2022/03/08 08:33:47] root INFO: channel_first : False
[2022/03/08 08:33:47] root INFO: img_mode : BGR
[2022/03/08 08:33:47] root INFO: RecAug : None
[2022/03/08 08:33:47] root INFO: CTCLabelEncode : None
[2022/03/08 08:33:47] root INFO: RecResizeImg :
[2022/03/08 08:33:47] root INFO: image_shape : [3, 32, 320]
[2022/03/08 08:33:47] root INFO: KeepKeys :
[2022/03/08 08:33:47] root INFO: keep_keys : ['image', 'label', 'length']
[2022/03/08 08:33:47] root INFO: loader :
[2022/03/08 08:33:47] root INFO: batch_size_per_card : 64
[2022/03/08 08:33:47] root INFO: drop_last : True
[2022/03/08 08:33:47] root INFO: num_sections : 1
[2022/03/08 08:33:47] root INFO: num_workers : 1
[2022/03/08 08:33:47] root INFO: shuffle : True
[2022/03/08 08:33:47] root INFO: profiler_options : None
[2022/03/08 08:33:47] root INFO: train with paddle 2.2.1 and device CUDAPlace(0)
[2022/03/08 08:33:47] root INFO: Initialize indexs of datasets:['d:/ocr/train_data_1_5_2k/lable.txt']
[2022/03/08 08:34:07] root INFO: Initialize indexs of datasets:['d:/ocr/test_lable.txt']
[2022/03/08 08:34:10] root WARNING: The shape of model params Teacher.head.fc2.weight [96, 21333] not matched with loaded params Teacher.head.fc2.weight [96, 6625] !
[2022/03/08 08:34:10] root WARNING: The shape of model params Teacher.head.fc2.bias [21333] not matched with loaded params Teacher.head.fc2.bias [6625] !
[2022/03/08 08:34:10] root WARNING: The shape of model params Student.head.fc2.weight [96, 21333] not matched with loaded params Student.head.fc2.weight [96, 6625] !
[2022/03/08 08:34:10] root WARNING: The shape of model params Student.head.fc2.bias [21333] not matched with loaded params Student.head.fc2.bias [6625] !
[2022/03/08 08:34:10] root INFO: load pretrain successful from ./train_model/ch_PP-OCRv2_rec_train/best_accuracy
[2022/03/08 08:34:10] root INFO: train dataloader has 232016 iters
[2022/03/08 08:34:10] root INFO: valid dataloader has 1123 iters
[2022/03/08 08:34:10] root INFO: During the training process, after the 0th iteration, an evaluation is run every 200000 iterations
[2022/03/08 08:34:10] root INFO: Initialize indexs of datasets:['d:/ocr/train_data_1_5_2k/lable.txt']

dengmingD commented 2 years ago

I looked into it, and the 6625 appears to be hard-coded in the C++ code. Does that mean a custom dictionary can only contain 6625 characters?
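For anyone reading along: the number in the warning can be read as a function of the dictionary size rather than a fixed constant. Below is a minimal sketch, assuming the CTC head's output width is one class per dictionary line, plus one CTC blank, plus one more class when `use_space_char` is enabled. That is my understanding of how the label encoder builds its character list, so please verify it against your PaddleOCR version. `ctc_head_size` is a hypothetical helper, not a PaddleOCR API.

```python
import os
import tempfile


def ctc_head_size(dict_path, use_space_char=False):
    """Estimate the CTC head output width for a character dictionary.

    Assumption (not confirmed in this thread): one class per dictionary
    line, plus one CTC blank, plus one space class if use_space_char.
    """
    with open(dict_path, encoding="utf-8") as f:
        chars = [line.rstrip("\r\n") for line in f]
    n_classes = len(chars)
    if use_space_char:
        n_classes += 1  # extra class for the space character
    return n_classes + 1  # extra class for the CTC blank


# Demo with a throwaway three-character dictionary (hypothetical data).
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("a\nb\nc\n")
    demo_path = tmp.name

demo_size = ctc_head_size(demo_path)
demo_size_space = ctc_head_size(demo_path, use_space_char=True)
os.remove(demo_path)
print(demo_size, demo_size_space)
```

Running the same helper on your own dictionary file and comparing the result with the first number in the warning is a quick way to confirm that the model head follows the dictionary.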

dengmingD commented 2 years ago

The config is as follows:

Global:
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 100
  print_batch_step: 100
  save_model_dir: ./output/rec_pp-OCRv2_distillation_1_5_2k
  save_epoch_step: 1
  eval_batch_step: [0, 200000]
  cal_metric_during_train: true
  pretrained_model: ./train_model/ch_PP-OCRv2_rec_train/best_accuracy
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/ppocr_keys_v2.txt
  character_type: ch
  max_text_length: 25
  infer_mode: false
  use_space_char: False
  distributed: true
  save_res_path: ./output/rec/predicts_pp-OCRv2_distillation.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Piecewise
    decay_epochs : [700, 800]
    values : [0.0001, 0.00001]
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 2.0e-06

Architecture:
  model_type: &model_type "rec"
  name: DistillationModel
  algorithm: Distillation
  Models:
    Teacher:
      pretrained:
      freeze_params: false
      return_all_feats: true
      model_type: *model_type
      algorithm: CRNN
      Transform:
      Backbone:
        name: MobileNetV1Enhance
        scale: 0.5
      Neck:
        name: SequenceEncoder
        encoder_type: rnn
        hidden_size: 64
      Head:
        name: CTCHead
        mid_channels: 96
        fc_decay: 0.00002
    Student:
      pretrained:
      freeze_params: false
      return_all_feats: true
      model_type: *model_type
      algorithm: CRNN
      Transform:
      Backbone:
        name: MobileNetV1Enhance
        scale: 0.5
      Neck:
        name: SequenceEncoder
        encoder_type: rnn
        hidden_size: 64
      Head:
        name: CTCHead
        mid_channels: 96
        fc_decay: 0.00002

Loss:
  name: CombinedLoss
  loss_config_list:

PostProcess:
  name: DistillationCTCLabelDecode
  model_name: ["Student", "Teacher"]
  key: head_out

Metric:
  name: DistillationMetric
  base_metric_name: RecMetric
  main_indicator: acc
  key: "Student"

Train:
  dataset:
    name: SimpleDataSet
    data_dir: d:/ocr/train_data_1_5_2k
    label_file_list:

dengmingD commented 2 years ago

The shape of model params Student.head.fc2.bias [21333] not matched with loaded params Student.head.fc2.bias [6625] !

After training for a while the loss becomes NaN. Is this warning the cause?
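One way to narrow this down is to find the first step at which the loss stops being finite and then look at what changed around that point (learning rate warmup ending, a particular batch, and so on). The sketch below is a generic check, not part of PaddleOCR; `first_non_finite` and the sample values are hypothetical and only illustrate the idea.

```python
import math


def first_non_finite(losses):
    """Return the index of the first NaN/inf loss, or None if all are finite."""
    for step, value in enumerate(losses):
        if not math.isfinite(value):
            return step
    return None


# Hypothetical loss values for illustration only, not taken from the run above.
example_losses = [12.4, 9.8, 7.1, float("nan"), float("nan")]
bad_step = first_non_finite(example_losses)
print(bad_step)
```

If the loss is already non-finite within the first few steps, the checkpoint or the data is the more likely suspect. If it only appears much later, the optimizer settings deserve a look as well.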

dengmingD commented 2 years ago

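I tried ch_PP-OCRv2_rec_enhanced_ctc_loss.yml and it works, but I don't know how good the results are. Has anyone trained with it? Is it better than knowledge distillation? If it's convenient, please let me know. Thanks.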

dengmingD commented 2 years ago

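Found the cause: it is the pretrained model. The pretrained model only covers 6625 characters, so its parameter shapes differ from those required by the new dictionary. When training with a new dictionary, do not use the official pretrained model.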
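To see exactly which parameters are affected, the model's parameter shapes can be compared with those stored in the checkpoint. The sketch below works on plain name-to-shape dictionaries so it runs without Paddle installed. `split_by_shape` is a hypothetical helper and not PaddleOCR's loader; the fc2 shapes are copied from the warnings in this thread, and the backbone entry is a made-up placeholder.

```python
def split_by_shape(model_shapes, loaded_shapes):
    """Partition parameter names into loadable and shape-mismatched groups."""
    loadable, mismatched = [], []
    for name, shape in model_shapes.items():
        if name not in loaded_shapes:
            continue  # parameter is absent from the checkpoint
        if tuple(loaded_shapes[name]) == tuple(shape):
            loadable.append(name)
        else:
            mismatched.append((name, tuple(shape), tuple(loaded_shapes[name])))
    return loadable, mismatched


# fc2 shapes come from the warnings above; the backbone entry is hypothetical.
model_shapes = {
    "Student.head.fc2.weight": (96, 21333),
    "Student.head.fc2.bias": (21333,),
    "Student.backbone.example.weight": (8, 3, 3, 3),  # hypothetical name/shape
}
loaded_shapes = {
    "Student.head.fc2.weight": (96, 6625),
    "Student.head.fc2.bias": (6625,),
    "Student.backbone.example.weight": (8, 3, 3, 3),  # hypothetical name/shape
}

loadable, mismatched = split_by_shape(model_shapes, loaded_shapes)
for name, want, got in mismatched:
    print(f"{name}: model {want} vs checkpoint {got}")
```

In the log posted earlier, the only warnings concern `head.fc2` of the Teacher and the Student, which is the layer whose width follows the dictionary size.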

Peins commented 1 year ago

I also get a NaN loss after training for a while, right near the end of epoch 28.

xuxiansheng2018 commented 1 year ago

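"Found the cause: it is the pretrained model. The pretrained model only covers 6625 characters, so its parameter shapes differ from those required by the new dictionary. When training with a new dictionary, do not use the official pretrained model."

Hello, I added some characters of my own. Does that mean I cannot use the official pretrained model? What should I do instead?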

thds9 commented 3 months ago

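May I ask whether the problem was reported during training, or when exporting the trained model for inference? I trained from the pretrained model and training finished normally, but the export produced only two files and the inference.params file is missing.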