decode in rec_postprocess.py samples out of range indexes

HSILA commented 2 years ago

Please provide the following information to quickly locate the problem

System Environment: Ubuntu 20.04
Version: Paddle 2.2.2 (CPU), PaddleOCR 2.5.0 and 2.4.0
Related components:
Command Code: python3 tools/train.py -c my_config.yml
Complete Error Message:
```
Traceback (most recent call last):
  File "tools/train.py", line 188, in <module>
    main(config, device, logger, vdl_writer)
  File "tools/train.py", line 161, in main
    program.train(config, train_dataloader, valid_dataloader, device, model,
  File "**/PaddleOCR/tools/program.py", line 285, in train
    post_result = post_process_class(preds, batch[1])
  File "**/PaddleOCR/ppocr/postprocess/rec_postprocess.py", line 101, in __call__
    label = self.decode(label)
  File "**/PaddleOCR/ppocr/postprocess/rec_postprocess.py", line 64, in decode
    char_list = [
  File "**/PaddleOCR/ppocr/postprocess/rec_postprocess.py", line 65, in <listcomp>
    self.character[text_id]
IndexError: list index out of range
```


I'm training on the Persian alphabet. You can find my dictionary [here](https://pastebin.pl/view/181934dc).
The architecture is:
- TPS: None
- Backbone: MobileNetV3
- Neck: BiLSTM
- Head: CTCHead

`max_text_length` is set to 30.

I have 35 characters, including space, with `use_space_char` set to false (this is not the issue, since I've tested various settings for space). When I debug `rec_postprocess.py`, everything looks fine: there are 36 characters, with the blank at index 0:
![Screenshot from 2022-05-11 14-22-04](https://user-images.githubusercontent.com/40033849/167827090-a328e6d8-f20f-4879-82e2-f40ac8f51d91.png)
But the code tries to access `text_id` 36, which causes this error. I have tested this with both PaddleOCR v2.5.0 and v2.4.0, and both show the same behavior. I also hit this exact issue when predicting with the exported inference model, but for now let's focus on training.
![Screenshot from 2022-05-11 14-24-03](https://user-images.githubusercontent.com/40033849/167829260-33b5f03a-66f7-47bd-bee6-76a063e0b9fa.png)

It's also worth mentioning that when I use attention instead of CTC, everything works.
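To make the index arithmetic concrete, here is a minimal sketch of a CTC-style character table with the blank at index 0, and a check for label ids that fall outside it. The helpers `build_ctc_table` and `out_of_range_ids` and the placeholder symbols are hypothetical; they only mimic the layout described above and are not PaddleOCR's code.

```python
def build_ctc_table(dict_chars):
    """Prepend the CTC blank so that it takes index 0."""
    return ["blank"] + list(dict_chars)


def out_of_range_ids(label_ids, table):
    """Return the ids that cannot be looked up in the table."""
    return [i for i in label_ids if not 0 <= i < len(table)]


# 35 placeholder symbols standing in for the Persian dictionary.
dict_chars = [f"c{i}" for i in range(35)]
table = build_ctc_table(dict_chars)

print(len(table))                            # valid ids are 0..35
print(out_of_range_ids([1, 35, 36], table))  # ids the decoder cannot resolve
```

With 36 entries the largest valid id is 35, so an id of 36 in a label suggests the labels were produced with a table that has at least one more entry than the decoder's.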

tink2123 commented 2 years ago

For the inference model, the dictionary used at prediction must be consistent with the one used for training.

For training, check that the following two dicts have the same length:

https://github.com/PaddlePaddle/PaddleOCR/blob/791712e274fa6410b91fcb19efaf8e9db6d89862/ppocr/data/imaug/label_ops.py#L129

https://github.com/PaddlePaddle/PaddleOCR/blob/791712e274fa6410b91fcb19efaf8e9db6d89862/ppocr/postprocess/rec_postprocess.py#L46
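One way to run that comparison is sketched below. It assumes that both the label encoder and the post-process decoder keep their table in a `character` attribute, as the two linked lines appear to do; `compare_tables` and the stand-in objects are hypothetical, used so the snippet runs without PaddleOCR installed.

```python
from types import SimpleNamespace


def compare_tables(encoder, decoder):
    """Compare the character tables of a label encoder and a decoder.

    Assumes both objects expose their table as a `character` attribute.
    """
    enc, dec = list(encoder.character), list(decoder.character)
    return {
        "encoder_len": len(enc),
        "decoder_len": len(dec),
        "same_length": len(enc) == len(dec),
        "same_order": enc == dec,
    }


# Hypothetical stand-ins; in practice, pass the real encoder and decoder objects.
chars = ["a", "b", "c"]
encoder = SimpleNamespace(character=["blank"] + chars)
decoder = SimpleNamespace(character=["blank"] + chars + [" "])

report = compare_tables(encoder, decoder)
print(report)
```

Checking the order as well as the length catches the case where both tables have the same size but map ids to different characters.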

HSILA commented 2 years ago

@tink2123 I checked, the two lists were the same length for training. The problem was that label encoder and decoder in the config file didn't match.
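A mismatch of this kind can be caught before training with a small check over the loaded config. The sketch below is an assumption-laden illustration: `EXPECTED_DECODER` is a hand-written pairing that covers only the CTC and attention classes plus the `MultiLabelEncode` / `CTCLabelDecode` combination that appears later in this thread, and `find_encoder_mismatches` is a hypothetical helper, not part of PaddleOCR.

```python
# Assumed pairing of label encoders to decoders; extend for other algorithms.
EXPECTED_DECODER = {
    "CTCLabelEncode": "CTCLabelDecode",
    "AttnLabelEncode": "AttnLabelDecode",
    "MultiLabelEncode": "CTCLabelDecode",
}


def find_encoder_mismatches(config):
    """Return (section, encoder, decoder) tuples that do not pair up."""
    decoder = config.get("PostProcess", {}).get("name")
    mismatches = []
    for section in ("Train", "Eval"):
        dataset = config.get(section, {}).get("dataset", {})
        for op in dataset.get("transforms") or []:
            for name in op:  # each transform is a single-key mapping
                expected = EXPECTED_DECODER.get(name)
                if expected is not None and expected != decoder:
                    mismatches.append((section, name, decoder))
    return mismatches


# Toy config with an attention encoder left in the Train section by mistake.
config = {
    "PostProcess": {"name": "CTCLabelDecode"},
    "Train": {"dataset": {"transforms": [
        {"DecodeImage": {"img_mode": "BGR"}},
        {"AttnLabelEncode": None},
    ]}},
    "Eval": {"dataset": {"transforms": [{"CTCLabelEncode": None}]}},
}

mismatches = find_encoder_mismatches(config)
print(mismatches)
```

In a real run the `config` dict would come from parsing the YAML file passed with `-c`; the toy dict here only mirrors the relevant keys.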

Omar3esam commented 2 years ago

Any solution to this problem, guys?

tuyendt-cpu commented 1 year ago

I have the same question. Any solutions? Please help!

MohamedAliBenAlaya commented 1 year ago

Hello, I have the same problem. How can I solve it? This is my config file:

Global:
  debug: false
  use_gpu: true
  epoch_num: 300
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/v3_en_mobile
  save_epoch_step: 3
  eval_batch_step: [0, 250]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: /content/PaddleOCR/ppocr/utils/en_dict.txt
  max_text_length: &max_text_length 27
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_en.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: rec
  algorithm: SVTR
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 64
            depth: 2
            hidden_dims: 120
            use_guide: True
          Head:
            fc_decay: 0.00001
      - SARHead:
          enc_dim: 512
          max_text_length: *max_text_length

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - SARLoss:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: /content/dataf/
    ext_op_transform_idx: 1
    label_file_list:
      - /content/dataf/train.txt
    transforms:
      - DecodeImage:
          img_mode: BGR
          channel_first: false
      - RecConAug:
          prob: 0.5
          ext_data_num: 2
          image_shape: [48, 320, 3]
          max_text_length: *max_text_length
      - RecAug:
      - MultiLabelEncode:
      - RecResizeImg:
          image_shape: [3, 48, 320]
      - KeepKeys:
          keep_keys:
            - image
            - label_ctc
            - label_sar
            - length
            - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 64
    drop_last: true
    num_workers: 2

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /content/dataf
    label_file_list:
      - /content/dataf/test.txt
    transforms:
      - DecodeImage:
          img_mode: BGR
          channel_first: false
      - MultiLabelEncode:
      - RecResizeImg:
          image_shape: [3, 48, 320]
      - KeepKeys:
          keep_keys:
            - image
            - label_ctc
            - label_sar
            - length
            - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 64
    num_workers: 2

Issue:

[2023/03/06 15:30:29] ppocr DEBUG: dt_boxes num : 1, elapse : 0.03576517105102539
[2023/03/06 15:30:29] ppocr DEBUG: cls num : 1, elapse : 0.011165857315063477

IndexError                                Traceback (most recent call last)
in
----> 1 custom_ocr.ocr(img)

4 frames
/content/PaddleOCRproject/ppocr/postprocess/rec_postprocess.py in decode(self, text_index, text_prob, is_remove_duplicate)
     91             # print(text_index)
     92             for text_id in text_index[batch_idx][selection]:
---> 93                 print(self.character[text_id])
     94             char_list = [
     95                 self.character[text_id]

IndexError: list index out of range
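When the error shows up at prediction time like this, it can help to replace the bare lookup with one that reports the offending id and the table size, which makes a dictionary mismatch between training and prediction easier to spot. The function below is a hypothetical debugging aid written for this thread; `safe_decode` is not part of PaddleOCR, and `character` stands for the decoder's character list.

```python
def safe_decode(text_ids, character):
    """Map ids to characters, raising a descriptive error for bad ids."""
    chars = []
    for pos, text_id in enumerate(text_ids):
        if not 0 <= text_id < len(character):
            raise IndexError(
                f"id {text_id} at position {pos} is outside the character "
                f"table (size {len(character)}); check that the dictionary "
                f"and use_space_char match the ones used for training"
            )
        chars.append(character[text_id])
    return "".join(chars)


character = ["blank", "a", "b"]
print(safe_decode([1, 2, 1], character))

try:
    safe_decode([1, 3], character)
except IndexError as err:
    print(err)
```

If the reported id is equal to or larger than the table size, the model was likely exported with a larger output dimension than the dictionary loaded at prediction time provides.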

PaddlePaddle / PaddleOCR

decode in rec_postprocess.py samples out of range indexes #6253
