Training with the RARE model converges very slowly, and prediction accuracy is very low

kano201 commented 1 year ago

I am training a RARE model for Mongolian text recognition. In the training config file I changed only the following:

- maximum character length set to 200
- space character enabled
- input image size set to [3, 32, 2000]

At first training would not run. After I also changed the batch_max_length parameter of the forward function in the AttentionLSTM and AttentionHead classes in rec_att_head.py to 200, training ran normally, but convergence is very slow and the final recognition accuracy is very low. The same data and configuration train fine with the CRNN model. The full configuration:

    Global:
      use_gpu: True
      epoch_num: 72
      log_smooth_window: 20
      print_batch_step: 10
      save_model_dir: ./output/rec/mn_rare_real/
      save_epoch_step: 3
      # evaluation is run every 5000 iterations after the 4000th iteration
      eval_batch_step: [0, 2000]
      cal_metric_during_train: True
      pretrained_model:
      checkpoints:
      save_inference_dir:
      use_visualdl: False
      infer_img:
      # for data or label process
      character_dict_path: ./train_data/dict_test.txt
      max_text_length: 200
      infer_mode: False
      use_space_char: True
      save_res_path: ./output/rec/predicts_rare_real.txt

    Optimizer:
      name: Adam
      beta1: 0.9
      beta2: 0.999
      lr:
        learning_rate: 0.0005
      regularizer:
        name: 'L2'
        factor: 0.00001

    Architecture:
      model_type: rec
      algorithm: RARE
      Transform:
        name: TPS
        num_fiducial: 20
        loc_lr: 0.1
        model_name: small
      Backbone:
        name: MobileNetV3
        scale: 0.5
        model_name: large
      Neck:
        name: SequenceEncoder
        encoder_type: rnn
        hidden_size: 96
      Head:
        name: AttentionHead
        hidden_size: 96

    Loss:
      name: AttentionLoss

    PostProcess:
      name: AttnLabelDecode

    Metric:
      name: RecMetric
      main_indicator: acc

    Train:
      dataset:
        name: SimpleDataSet
        data_dir: ./train_data/
        label_file_list:
          - ./train_data/small_size_real_train_checked.txt
        transforms:
          - DecodeImage: # load image
              img_mode: BGR
              channel_first: False
          - AttnLabelEncode: # Class handling label
          - RecResizeImg:
              image_shape: [3, 32, 2000]
          - KeepKeys:
              keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
      loader:
        shuffle: True
        batch_size_per_card: 32
        drop_last: True
        num_workers: 8

    Eval:
      dataset:
        name: SimpleDataSet
        data_dir: ./train_data/
        label_file_list:
          - ./train_data/mn_eval_checked.txt
        transforms:
          - DecodeImage: # load image
              img_mode: BGR
              channel_first: False
          - AttnLabelEncode: # Class handling label
          - RecResizeImg:
              image_shape: [3, 32, 2000]
          - KeepKeys:
              keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
      loader:
        shuffle: False
        drop_last: False
        batch_size_per_card: 32
        num_workers: 1
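
The configuration above sets `image_shape` separately for Train and Eval, so the two can drift apart when one is edited by hand. Below is a minimal sketch of a consistency check. It assumes the config has already been loaded into a plain Python dict (for example by a YAML parser) and that each transform is a single-key mapping. The helper name `find_image_shape` is hypothetical and not part of PaddleOCR.

```python
def find_image_shape(section):
    """Return the RecResizeImg image_shape of a Train/Eval section, or None."""
    for transform in section["dataset"]["transforms"]:
        # Each transform is assumed to be a single-key mapping,
        # e.g. {"RecResizeImg": {"image_shape": [3, 32, 2000]}}.
        params = transform.get("RecResizeImg")
        if params:
            return params["image_shape"]
    return None


# Only the parts of the posted config that matter for this check.
config = {
    "Train": {"dataset": {"transforms": [
        {"DecodeImage": {"img_mode": "BGR", "channel_first": False}},
        {"AttnLabelEncode": None},
        {"RecResizeImg": {"image_shape": [3, 32, 2000]}},
    ]}},
    "Eval": {"dataset": {"transforms": [
        {"DecodeImage": {"img_mode": "BGR", "channel_first": False}},
        {"AttnLabelEncode": None},
        {"RecResizeImg": {"image_shape": [3, 32, 2000]}},
    ]}},
}

train_shape = find_image_shape(config["Train"])
eval_shape = find_image_shape(config["Eval"])
shapes_match = train_shape == eval_shape
print("Train:", train_shape, "Eval:", eval_shape, "match:", shapes_match)
```

The same comparison can be made against whatever image shape is passed at inference time.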

an1018 commented 1 year ago

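With the image width changed to 2000, the input is much larger, so training will be slower and results worse. Check whether the images become too blurry once resized to [3, 32, 2000], and whether the image size was changed to match in the infer step.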
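
One way to act on the suggestion above is to estimate, from each image's original size, how strongly a 32 x 2000 target would distort it. The sketch below is illustrative only: the function name `resize_report` is hypothetical, and it reports numbers for two possible resize policies (plain stretching to the full width, and aspect-preserving resize followed by padding) without claiming which one a given PaddleOCR version applies.

```python
import math


def resize_report(orig_w, orig_h, target_h=32, target_w=2000):
    """Summarize how an orig_w x orig_h image fits a target_h x target_w input."""
    # Width the image would have if only its height were scaled to target_h.
    natural_w = math.ceil(target_h * orig_w / orig_h)
    # Policy A (assumption): stretch to the full target width.
    # Values far above 1 mean heavy horizontal upsampling, hence blur.
    stretch_factor = target_w / natural_w
    # Policy B (assumption): keep the aspect ratio, cap at target_w, pad the rest.
    kept_w = min(natural_w, target_w)
    pad_fraction = (target_w - kept_w) / target_w
    return {
        "natural_w": natural_w,
        "stretch_factor": stretch_factor,
        "kept_w": kept_w,
        "pad_fraction": pad_fraction,
    }


# A 400 x 64 text line is only 200 px wide at height 32.
report = resize_report(400, 64)
print(report)
```

Running this over the training set and looking at the distribution of `stretch_factor` or `pad_fraction` shows whether a width of 2000 is far larger than most samples need.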

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

PaddlePaddle / PaddleOCR

Training with the RARE model converges very slowly, and prediction accuracy is very low #9535
