Closed by aspaul20 1 month ago
Can you provide a minimal reproducible demo, including the dataset and configuration file?
Here you go, @GreatV: example.zip. For the pretrained model you can use the default en PPOCRv3 weights, or none at all. The error remains the same.
Any pointers on how to resolve this? It looks like something is going wrong in add_gasuss_noise.
...
Note: It's not the image size; I think it has to do with the label. I just tried a 5000x500 image with a one-word label and it works fine. But if I lengthen the label, it throws the error.
@aspaul20, you may need to add more data.
The issue was max_text_length in the config file. My labels were longer than the maximum number of allowed characters. I increased this parameter and the data loads just fine.
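For anyone hitting the same thing, here is a minimal sketch of a pre-flight check for over-long labels. It assumes the usual recognition label format of one `image_path<TAB>label` entry per line. The function name `check_labels` and the sample values are made up for illustration and are not part of PaddleOCR. The exact boundary the label encoders apply (`>` versus `>=`) may differ, so leave some margin when choosing max_text_length.

```python
def check_labels(label_lines, max_text_length):
    """Return (too_long, malformed) for an iterable of label-file lines.

    too_long:  list of (line_number, image_path, label_length)
    malformed: list of line numbers that have no tab separator
    """
    too_long, malformed = [], []
    for number, raw in enumerate(label_lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        path, sep, label = line.partition("\t")
        if not sep:
            malformed.append(number)
        elif len(label) > max_text_length:
            too_long.append((number, path, len(label)))
    return too_long, malformed


# Hypothetical sample data; in practice pass an open label file instead.
sample = ["img_1.jpg\tshort", "img_2.jpg\t" + "x" * 30, "img_3.jpg no tab"]
too_long, malformed = check_labels(sample, max_text_length=25)
print(too_long)
print(malformed)
```

To run it on a real file, something like `check_labels(open("rec_train.txt", encoding="utf-8"), 50)` should work, with the path and limit taken from your own config.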
I'm getting the same "list index out of range" error while training. I have increased max_text_length too.
[2024/07/17 07:10:35] ppocr INFO: Architecture :
[2024/07/17 07:10:35] ppocr INFO: Backbone :
[2024/07/17 07:10:35] ppocr INFO: last_conv_stride : [1, 2]
[2024/07/17 07:10:35] ppocr INFO: last_pool_kernel_size : [2, 2]
[2024/07/17 07:10:35] ppocr INFO: last_pool_type : avg
[2024/07/17 07:10:35] ppocr INFO: name : MobileNetV1Enhance
[2024/07/17 07:10:35] ppocr INFO: scale : 0.5
[2024/07/17 07:10:35] ppocr INFO: Head :
[2024/07/17 07:10:35] ppocr INFO: head_list :
[2024/07/17 07:10:35] ppocr INFO: CTCHead :
[2024/07/17 07:10:35] ppocr INFO: Head :
[2024/07/17 07:10:35] ppocr INFO: fc_decay : 1e-05
[2024/07/17 07:10:35] ppocr INFO: Neck :
[2024/07/17 07:10:35] ppocr INFO: depth : 2
[2024/07/17 07:10:35] ppocr INFO: dims : 64
[2024/07/17 07:10:35] ppocr INFO: hidden_dims : 120
[2024/07/17 07:10:35] ppocr INFO: name : svtr
[2024/07/17 07:10:35] ppocr INFO: use_guide : True
[2024/07/17 07:10:35] ppocr INFO: SARHead :
[2024/07/17 07:10:35] ppocr INFO: enc_dim : 512
[2024/07/17 07:10:35] ppocr INFO: max_text_length : 50
[2024/07/17 07:10:35] ppocr INFO: name : MultiHead
[2024/07/17 07:10:35] ppocr INFO: Transform : None
[2024/07/17 07:10:35] ppocr INFO: algorithm : SVTR_LCNet
[2024/07/17 07:10:35] ppocr INFO: model_type : rec
[2024/07/17 07:10:35] ppocr INFO: Eval :
[2024/07/17 07:10:35] ppocr INFO: dataset :
[2024/07/17 07:10:35] ppocr INFO: data_dir : /content/drive/MyDrive/Paddle_OCR/Label_OCR/Train_data/rec/Test
[2024/07/17 07:10:35] ppocr INFO: label_file_list : ['/content/drive/MyDrive/Paddle_OCR/Label_OCR/Train_data/rec/rec_test.txt']
[2024/07/17 07:10:35] ppocr INFO: name : SimpleDataSet
[2024/07/17 07:10:35] ppocr INFO: transforms :
[2024/07/17 07:10:35] ppocr INFO: DecodeImage :
[2024/07/17 07:10:35] ppocr INFO: channel_first : False
[2024/07/17 07:10:35] ppocr INFO: img_mode : BGR
[2024/07/17 07:10:35] ppocr INFO: MultiLabelEncode : None
[2024/07/17 07:10:35] ppocr INFO: RecResizeImg :
[2024/07/17 07:10:35] ppocr INFO: image_shape : [3, 48, 320]
[2024/07/17 07:10:35] ppocr INFO: KeepKeys :
[2024/07/17 07:10:35] ppocr INFO: keep_keys : ['image', 'label_ctc', 'label_sar', 'length', 'valid_ratio']
[2024/07/17 07:10:35] ppocr INFO: loader :
[2024/07/17 07:10:35] ppocr INFO: batch_size_per_card : 128
[2024/07/17 07:10:35] ppocr INFO: drop_last : False
[2024/07/17 07:10:35] ppocr INFO: num_workers : 4
[2024/07/17 07:10:35] ppocr INFO: shuffle : False
[2024/07/17 07:10:35] ppocr INFO: Global :
[2024/07/17 07:10:35] ppocr INFO: cal_metric_during_train : True
[2024/07/17 07:10:35] ppocr INFO: character_dict_path : /content/drive/MyDrive/Paddle_OCR/Label_OCR/PaddleOCR-main/ppocr/utils/en_dict.txt
[2024/07/17 07:10:35] ppocr INFO: checkpoints : None
[2024/07/17 07:10:35] ppocr INFO: debug : False
[2024/07/17 07:10:35] ppocr INFO: distributed : False
[2024/07/17 07:10:35] ppocr INFO: epoch_num : 10
[2024/07/17 07:10:35] ppocr INFO: eval_batch_step : [0, 2000]
[2024/07/17 07:10:35] ppocr INFO: infer_img : /content/drive/MyDrive/Paddle_OCR/Label_OCR/PaddleOCR-main/doc/imgs_words/en/word_1.png
[2024/07/17 07:10:35] ppocr INFO: infer_mode : False
[2024/07/17 07:10:35] ppocr INFO: log_smooth_window : 20
[2024/07/17 07:10:35] ppocr INFO: max_text_length : 50
[2024/07/17 07:10:35] ppocr INFO: pretrained_model : /content/drive/MyDrive/Paddle_OCR/Label_OCR/PaddleOCR-main/pretrain_models/en_PP-OCRv3_rec_train/best_accuracy
[2024/07/17 07:10:35] ppocr INFO: print_batch_step : 10
[2024/07/17 07:10:35] ppocr INFO: save_epoch_step : 3
[2024/07/17 07:10:35] ppocr INFO: save_inference_dir : None
[2024/07/17 07:10:35] ppocr INFO: save_model_dir : ./output/v3_en_mobile
[2024/07/17 07:10:35] ppocr INFO: save_res_path : ./output/rec/predicts_ppocrv3_en.txt
[2024/07/17 07:10:35] ppocr INFO: use_gpu : True
[2024/07/17 07:10:35] ppocr INFO: use_space_char : False
[2024/07/17 07:10:35] ppocr INFO: use_visualdl : False
[2024/07/17 07:10:35] ppocr INFO: Loss :
[2024/07/17 07:10:35] ppocr INFO: loss_config_list :
[2024/07/17 07:10:35] ppocr INFO: CTCLoss : None
[2024/07/17 07:10:35] ppocr INFO: SARLoss : None
[2024/07/17 07:10:35] ppocr INFO: name : MultiLoss
[2024/07/17 07:10:35] ppocr INFO: Metric :
[2024/07/17 07:10:35] ppocr INFO: ignore_space : False
[2024/07/17 07:10:35] ppocr INFO: main_indicator : acc
[2024/07/17 07:10:35] ppocr INFO: name : RecMetric
[2024/07/17 07:10:35] ppocr INFO: Optimizer :
[2024/07/17 07:10:35] ppocr INFO: beta1 : 0.9
[2024/07/17 07:10:35] ppocr INFO: beta2 : 0.999
[2024/07/17 07:10:35] ppocr INFO: lr :
[2024/07/17 07:10:35] ppocr INFO: learning_rate : 0.001
[2024/07/17 07:10:35] ppocr INFO: name : Cosine
[2024/07/17 07:10:35] ppocr INFO: warmup_epoch : 5
[2024/07/17 07:10:35] ppocr INFO: name : Adam
[2024/07/17 07:10:35] ppocr INFO: regularizer :
[2024/07/17 07:10:35] ppocr INFO: factor : 3e-05
[2024/07/17 07:10:35] ppocr INFO: name : L2
[2024/07/17 07:10:35] ppocr INFO: PostProcess :
[2024/07/17 07:10:35] ppocr INFO: name : CTCLabelDecode
[2024/07/17 07:10:35] ppocr INFO: Train :
[2024/07/17 07:10:35] ppocr INFO: dataset :
[2024/07/17 07:10:35] ppocr INFO: data_dir : /content/drive/MyDrive/Paddle_OCR/Label_OCR/Train_data/rec/Train
[2024/07/17 07:10:35] ppocr INFO: ext_op_transform_idx : 1
[2024/07/17 07:10:35] ppocr INFO: label_file_list : ['/content/drive/MyDrive/Paddle_OCR/Label_OCR/Train_data/rec/rec_train.txt']
[2024/07/17 07:10:35] ppocr INFO: name : SimpleDataSet
[2024/07/17 07:10:35] ppocr INFO: transforms :
[2024/07/17 07:10:35] ppocr INFO: DecodeImage :
[2024/07/17 07:10:35] ppocr INFO: channel_first : False
[2024/07/17 07:10:35] ppocr INFO: img_mode : BGR
[2024/07/17 07:10:35] ppocr INFO: RecConAug :
[2024/07/17 07:10:35] ppocr INFO: ext_data_num : 2
[2024/07/17 07:10:35] ppocr INFO: image_shape : [48, 320, 3]
[2024/07/17 07:10:35] ppocr INFO: max_text_length : 50
[2024/07/17 07:10:35] ppocr INFO: prob : 0.5
[2024/07/17 07:10:35] ppocr INFO: RecAug : None
[2024/07/17 07:10:35] ppocr INFO: MultiLabelEncode : None
[2024/07/17 07:10:35] ppocr INFO: RecResizeImg :
[2024/07/17 07:10:35] ppocr INFO: image_shape : [3, 48, 320]
[2024/07/17 07:10:35] ppocr INFO: KeepKeys :
[2024/07/17 07:10:35] ppocr INFO: keep_keys : ['image', 'label_ctc', 'label_sar', 'length', 'valid_ratio']
[2024/07/17 07:10:35] ppocr INFO: loader :
[2024/07/17 07:10:35] ppocr INFO: batch_size_per_card : 128
[2024/07/17 07:10:35] ppocr INFO: drop_last : True
[2024/07/17 07:10:35] ppocr INFO: num_workers : 4
[2024/07/17 07:10:35] ppocr INFO: shuffle : True
[2024/07/17 07:10:35] ppocr INFO: profiler_options : None
[2024/07/17 07:10:35] ppocr INFO: train with paddle 2.6.1 and device Place(gpu:0)
[2024/07/17 07:10:35] ppocr INFO: Initialize indexs of datasets:['/content/drive/MyDrive/Paddle_OCR/Label_OCR/Train_data/rec/rec_train.txt']
list index out of range
I have tried implementing the solutions mentioned above, but the issue still persists. Are there any other configurations I have to consider to get going?
[2024/07/17 07:10:35] ppocr INFO: Initialize indexs of datasets:['/content/drive/MyDrive/Paddle_OCR/Label_OCR/Train_data/rec/rec_train.txt']
list index out of range
Is this where the trace ends? This exception usually doesn't stop the code in my experience.
@aspaul20 This is the complete log file: log.txt
@MaroofAbdullah See line 128 of your log:
[2024/07/18 14:36:38] ppocr ERROR: When parsing line Train_data/rec/Crop_data/crop_train_img/img330_crop_0.jpg IN
, error happened with msg: Traceback (most recent call last):
File "D:\Label OCR\PaddleOCR-main\ppocr\data\simple_dataset.py", line 157, in __getitem__
raise Exception("{} does not exist!".format(img_path))
Exception: Train_data/rec/Crop_data/crop_train_img\Train_data/rec/Crop_data/crop_train_img/img330_crop_0.jpg does not exist!
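Your paths are incorrect: the files aren't at those paths. Note that the path in the exception contains the Train_data/rec/Crop_data/crop_train_img prefix twice, once from the data directory and once from the label-file entry.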
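A quick way to catch this before training is to resolve every label entry against the data directory and see which files are missing. This is only a sketch: it assumes the dataset joins `data_dir` and the label-file path with `os.path.join` (which is what the exception message above suggests), and `missing_images` is a made-up helper name, not a PaddleOCR function.

```python
import os
import tempfile


def missing_images(data_dir, label_lines):
    """Return the resolved paths of label entries whose image file is missing."""
    missing = []
    for raw in label_lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        rel_path = line.split("\t", 1)[0]
        full_path = os.path.join(data_dir, rel_path)
        if not os.path.exists(full_path):
            missing.append(full_path)
    return missing


# Demo with a throwaway directory: the second entry repeats the directory
# name that data_dir already ends with, so it resolves to a missing file.
with tempfile.TemporaryDirectory() as root:
    data_dir = os.path.join(root, "crop_train_img")
    os.makedirs(data_dir)
    open(os.path.join(data_dir, "img_ok.jpg"), "wb").close()
    labels = ["img_ok.jpg\tIN", "crop_train_img/img_ok.jpg\tIN"]
    result = missing_images(data_dir, labels)
    print(result)
```

If this reports every file as missing, the usual fix is either to make the label entries relative to data_dir or to shorten data_dir so the prefix is not repeated.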
Problem Description
PaddleOCR recognition model won't train on images that are very wide. For instance, when trying to train on image(s) as big as 2834x210 pixels, the dataloader gets stuck for a long time and then throws a recursion error (given below). If I put these images in a bigger dataset with images of smaller widths, the dataloader only returns the smaller width images and the wider ones are ignored.
Runtime Environment
Reproduction Code
Run train.py on configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml, using any image of the above size with any label.
Complete Error Message
Possible solutions
Is this a limitation of Paddle? How can I fix this? Do I need a word-level dataset, or can I train recognition on images of sentences? Direct inference on a sentence seems to work, so training should be possible too.
Appendix