Closed suyogsatyal closed 2 years ago
Note: by default, the image path and image label are separated by \t. Using any other separator will cause a training error.
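One way to check a label file against this rule before training is sketched below. It is a standalone helper, not part of PaddleOCR: the name `find_bad_label_lines` is hypothetical, and it is stricter than the tab rule above in that it reports any line that does not yield exactly two non-empty fields.

```python
def find_bad_label_lines(label_path, delimiter="\t"):
    """Return (line_number, text) for every line that is not 'path<delimiter>label'.

    Hypothetical helper for illustration; PaddleOCR does not ship this function.
    """
    bad = []
    with open(label_path, encoding="utf-8") as f:
        for lineno, raw in enumerate(f, start=1):
            line = raw.rstrip("\r\n")
            fields = line.split(delimiter)
            # A usable line has exactly two non-empty fields: image path and label.
            if len(fields) != 2 or not all(fields):
                bad.append((lineno, line))
    return bad


if __name__ == "__main__":
    import sys

    # Usage: python check_labels.py ./train_data/rec/rec_gt_train.txt
    for lineno, text in find_bad_label_lines(sys.argv[1]):
        print(f"line {lineno}: {text!r}")
```

If the script prints nothing, every line splits cleanly on the tab character. Blank lines are reported as well, since they also produce fewer than two fields.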
Since you haven't replied for more than 3 months, we have closed this issue/PR. If the problem is not solved or there is a follow-up one, please reopen it at any time and we will continue to follow up. It is recommended to pull and try the latest code first.
@LDOUBLEV Even after splitting the image path and label with \t, I am still getting the same error.
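The log further down echoes label lines such as `train_55_12387.png, "U"`, which suggests the file being read still uses a comma and quotes rather than a tab. A minimal sketch of a converter follows, assuming every line has the shape `path, "label"`. The function names and file paths are hypothetical, and the input format is inferred only from those echoed lines.

```python
import re

# Matches: <path> , optional-quote <label> optional-quote
# The path is assumed to contain no comma; the label may contain commas.
LINE_RE = re.compile(r'^\s*(?P<path>[^,]+?)\s*,\s*"?(?P<label>.*?)"?\s*$')


def to_tab_separated(line):
    """Convert 'img.png, "U"' to 'img.png<TAB>U'; return None if there is no comma."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    return f"{match.group('path')}\t{match.group('label')}"


def convert_file(src_path, dst_path):
    """Rewrite a comma-separated label file as tab-separated; return skipped line count."""
    skipped = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8", newline="\n") as dst:
        for raw in src:
            converted = to_tab_separated(raw.rstrip("\r\n"))
            if converted is None:
                skipped += 1  # no comma found, leave for manual inspection
                continue
            dst.write(converted + "\n")
    return skipped
```

Writing to a new file and pointing `label_file_list` at it keeps the original labels intact in case the assumed format turns out to be wrong for some lines.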
I have been trying to train PaddleOCR on NIST Special Database 19 for handwritten text. I followed all the steps shown on the GitHub page except for text detection training. After running "python tools/train.py -c configs/rec/rec_icdar15_train.yml" in the Conda prompt, I get the message "label = substr[1] IndexError: list index out of range".
The full output is given below.
Please help.
(paddle_env) D:\paddle\PaddleOCR>python tools/train.py -c configs/rec/rec_icdar15_train.yml
[2022/03/18 14:39:32] root INFO: Architecture :
[2022/03/18 14:39:32] root INFO: Backbone :
[2022/03/18 14:39:32] root INFO: model_name : large
[2022/03/18 14:39:32] root INFO: name : MobileNetV3
[2022/03/18 14:39:32] root INFO: scale : 0.5
[2022/03/18 14:39:32] root INFO: Head :
[2022/03/18 14:39:32] root INFO: fc_decay : 0
[2022/03/18 14:39:32] root INFO: name : CTCHead
[2022/03/18 14:39:32] root INFO: Neck :
[2022/03/18 14:39:32] root INFO: encoder_type : rnn
[2022/03/18 14:39:32] root INFO: hidden_size : 96
[2022/03/18 14:39:32] root INFO: name : SequenceEncoder
[2022/03/18 14:39:32] root INFO: Transform : None
[2022/03/18 14:39:32] root INFO: algorithm : CRNN
[2022/03/18 14:39:32] root INFO: model_type : rec
[2022/03/18 14:39:32] root INFO: Eval :
[2022/03/18 14:39:32] root INFO: dataset :
[2022/03/18 14:39:32] root INFO: data_dir : ./train_data/ic15_data
[2022/03/18 14:39:32] root INFO: label_file_list : ['./train_data/rec/ic15_data/rec_gt_test.txt']
[2022/03/18 14:39:32] root INFO: name : SimpleDataSet
[2022/03/18 14:39:32] root INFO: transforms :
[2022/03/18 14:39:32] root INFO: DecodeImage :
[2022/03/18 14:39:32] root INFO: channel_first : False
[2022/03/18 14:39:32] root INFO: img_mode : BGR
[2022/03/18 14:39:32] root INFO: CTCLabelEncode : None
[2022/03/18 14:39:32] root INFO: RecResizeImg :
[2022/03/18 14:39:32] root INFO: image_shape : [3, 32, 100]
[2022/03/18 14:39:32] root INFO: KeepKeys :
[2022/03/18 14:39:32] root INFO: keep_keys : ['image', 'label', 'length']
[2022/03/18 14:39:32] root INFO: loader :
[2022/03/18 14:39:32] root INFO: batch_size_per_card : 256
[2022/03/18 14:39:32] root INFO: drop_last : False
[2022/03/18 14:39:32] root INFO: num_workers : 4
[2022/03/18 14:39:32] root INFO: shuffle : False
[2022/03/18 14:39:32] root INFO: use_shared_memory : False
[2022/03/18 14:39:32] root INFO: Global :
[2022/03/18 14:39:32] root INFO: cal_metric_during_train : True
[2022/03/18 14:39:32] root INFO: character_dict_path : ppocr/utils/en_dict.txt
[2022/03/18 14:39:32] root INFO: checkpoints : None
[2022/03/18 14:39:32] root INFO: debug : False
[2022/03/18 14:39:32] root INFO: distributed : False
[2022/03/18 14:39:32] root INFO: epoch_num : 72
[2022/03/18 14:39:32] root INFO: eval_batch_step : [0, 2000]
[2022/03/18 14:39:32] root INFO: infer_img : doc/imgs_words_en/word_10.png
[2022/03/18 14:39:32] root INFO: infer_mode : False
[2022/03/18 14:39:32] root INFO: log_smooth_window : 20
[2022/03/18 14:39:32] root INFO: max_text_length : 25
[2022/03/18 14:39:32] root INFO: pretrained_model : None
[2022/03/18 14:39:32] root INFO: print_batch_step : 10
[2022/03/18 14:39:32] root INFO: save_epoch_step : 3
[2022/03/18 14:39:32] root INFO: save_inference_dir : ./
[2022/03/18 14:39:32] root INFO: save_model_dir : ./output/rec/ic15/
[2022/03/18 14:39:32] root INFO: save_res_path : ./output/rec/predicts_ic15.txt
[2022/03/18 14:39:32] root INFO: use_gpu : False
[2022/03/18 14:39:32] root INFO: use_space_char : False
[2022/03/18 14:39:32] root INFO: use_visualdl : False
[2022/03/18 14:39:32] root INFO: Loss :
[2022/03/18 14:39:32] root INFO: name : CTCLoss
[2022/03/18 14:39:32] root INFO: Metric :
[2022/03/18 14:39:32] root INFO: main_indicator : acc
[2022/03/18 14:39:32] root INFO: name : RecMetric
[2022/03/18 14:39:32] root INFO: Optimizer :
[2022/03/18 14:39:32] root INFO: beta1 : 0.9
[2022/03/18 14:39:32] root INFO: beta2 : 0.999
[2022/03/18 14:39:32] root INFO: lr :
[2022/03/18 14:39:32] root INFO: learning_rate : 0.0005
[2022/03/18 14:39:32] root INFO: name : Adam
[2022/03/18 14:39:32] root INFO: regularizer :
[2022/03/18 14:39:32] root INFO: factor : 0
[2022/03/18 14:39:32] root INFO: name : L2
[2022/03/18 14:39:32] root INFO: PostProcess :
[2022/03/18 14:39:32] root INFO: name : CTCLabelDecode
[2022/03/18 14:39:32] root INFO: Train :
[2022/03/18 14:39:32] root INFO: dataset :
[2022/03/18 14:39:32] root INFO: data_dir : ./train_data/ic15_data/
[2022/03/18 14:39:32] root INFO: label_file_list : ['./train_data/rec/rec_gt_train.txt']
[2022/03/18 14:39:32] root INFO: name : SimpleDataSet
[2022/03/18 14:39:32] root INFO: transforms :
[2022/03/18 14:39:32] root INFO: DecodeImage :
[2022/03/18 14:39:32] root INFO: channel_first : False
[2022/03/18 14:39:32] root INFO: img_mode : BGR
[2022/03/18 14:39:32] root INFO: CTCLabelEncode : None
[2022/03/18 14:39:32] root INFO: RecResizeImg :
[2022/03/18 14:39:32] root INFO: image_shape : [3, 32, 100]
[2022/03/18 14:39:32] root INFO: KeepKeys :
[2022/03/18 14:39:32] root INFO: keep_keys : ['image', 'label', 'length']
[2022/03/18 14:39:32] root INFO: loader :
[2022/03/18 14:39:32] root INFO: batch_size_per_card : 256
[2022/03/18 14:39:32] root INFO: drop_last : True
[2022/03/18 14:39:32] root INFO: num_workers : 8
[2022/03/18 14:39:32] root INFO: shuffle : True
[2022/03/18 14:39:32] root INFO: use_shared_memory : False
[2022/03/18 14:39:32] root INFO: profiler_options : None
[2022/03/18 14:39:32] root INFO: train with paddle 2.2.2 and device CPUPlace
[2022/03/18 14:39:32] root INFO: Initialize indexs of datasets:['./train_data/rec/rec_gt_train.txt']
[2022/03/18 14:39:34] root INFO: Initialize indexs of datasets:['./train_data/rec/ic15_data/rec_gt_test.txt']
[2022/03/18 14:39:35] root INFO: train from scratch
[2022/03/18 14:39:35] root INFO: train dataloader has 2849 iters
[2022/03/18 14:39:35] root INFO: valid dataloader has 73 iters
[2022/03/18 14:39:35] root INFO: During the training process, after the 0th iteration, an evaluation is run every 2000 iterations
[2022/03/18 14:39:35] root INFO: Initialize indexs of datasets:['./train_data/rec/rec_gt_train.txt']
[2022/03/18 14:39:36] root ERROR: When parsing line train_55_12387.png, "U" , error happened with msg: Traceback (most recent call last):
  File "D:\paddle\PaddleOCR\ppocr\data\simple_dataset.py", line 110, in __getitem__
    label = substr[1]
IndexError: list index out of range
[2022/03/18 14:39:36] root ERROR: When parsing line train_53_18903.png, "S" , error happened with msg: Traceback (most recent call last):
  File "D:\paddle\PaddleOCR\ppocr\data\simple_dataset.py", line 110, in __getitem__
    label = substr[1]
IndexError: list index out of range
[2022/03/18 14:39:36] root ERROR: When parsing line train_32_21270.png, "2" , error happened with msg: Traceback (most recent call last):
  File "D:\paddle\PaddleOCR\ppocr\data\simple_dataset.py", line 110, in __getitem__
    label = substr[1]
IndexError: list index out of range
(The same message repeats over a thousand times.)
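The repetition, and the RecursionError at the end of the traceback, are consistent with a dataset that retries a random sample each time a line fails to parse: the traceback shows `return self.__getitem__(rnd_idx)` at line 129 of simple_dataset.py. The toy class below is not PaddleOCR's code, only a sketch of that pattern under the assumption that no line in the file contains the delimiter. `RetryingDataset` and `recursion_blows_up` are hypothetical names.

```python
import random


class RetryingDataset:
    """Toy dataset that retries a random index whenever a line fails to parse."""

    def __init__(self, lines, delimiter="\t"):
        self.lines = lines
        self.delimiter = delimiter

    def __len__(self):
        return len(self.lines)

    def __getitem__(self, idx):
        try:
            substr = self.lines[idx].split(self.delimiter)
            # substr has one element when the delimiter is absent,
            # so substr[1] raises IndexError.
            return substr[0], substr[1]
        except IndexError:
            rnd_idx = random.randrange(len(self))
            return self.__getitem__(rnd_idx)  # recursive retry on another sample


def recursion_blows_up(dataset):
    """Return True if fetching the first sample ends in RecursionError."""
    try:
        dataset[0]
    except RecursionError:
        return True
    return False
```

With a file where every line lacks a tab, each retry fails again, so the recursion only stops when Python's recursion limit is reached. A file with at least some valid lines would usually recover after a few retries, which is why the failure here points at the whole label file rather than a handful of bad lines.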
Exception in thread Thread-3:
Traceback (most recent call last):
  File "D:\paddle\PaddleOCR\ppocr\data\simple_dataset.py", line 110, in __getitem__
    label = substr[1]
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\anaconda3\envs\paddle_env\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\anaconda3\envs\paddle_env\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\anaconda3\envs\paddle_env\lib\site-packages\paddle\fluid\dataloader\dataloader_iter.py", line 212, in _thread_loop
    batch = self._dataset_fetcher.fetch(indices,
  File "C:\anaconda3\envs\paddle_env\lib\site-packages\paddle\fluid\dataloader\fetcher.py", line 121, in fetch
    data.append(self.dataset[idx])
  File "D:\paddle\PaddleOCR\ppocr\data\simple_dataset.py", line 129, in __getitem__
    return self.__getitem__(rnd_idx)
  File "D:\paddle\PaddleOCR\ppocr\data\simple_dataset.py", line 129, in __getitem__
    return self.__getitem__(rnd_idx)
  File "D:\paddle\PaddleOCR\ppocr\data\simple_dataset.py", line 129, in __getitem__
    return self.__getitem__(rnd_idx)
  [Previous line repeated 979 more times]
  File "D:\paddle\PaddleOCR\ppocr\data\simple_dataset.py", line 121, in __getitem__
    self.logger.error(
  File "C:\anaconda3\envs\paddle_env\lib\logging\__init__.py", line 1475, in error
    self._log(ERROR, msg, args, **kwargs)
  File "C:\anaconda3\envs\paddle_env\lib\logging\__init__.py", line 1589, in _log
    self.handle(record)
  File "C:\anaconda3\envs\paddle_env\lib\logging\__init__.py", line 1599, in handle
    self.callHandlers(record)
  File "C:\anaconda3\envs\paddle_env\lib\logging\__init__.py", line 1661, in callHandlers
    hdlr.handle(record)
  File "C:\anaconda3\envs\paddle_env\lib\logging\__init__.py", line 954, in handle
    self.emit(record)
  File "C:\anaconda3\envs\paddle_env\lib\logging\__init__.py", line 1088, in emit
    stream.write(msg + self.terminator)
  File "C:\Users\user\AppData\Roaming\Python\Python38\site-packages\colorama\ansitowin32.py", line 41, in write
    self.convertor.write(text)
  File "C:\Users\user\AppData\Roaming\Python\Python38\site-packages\colorama\ansitowin32.py", line 162, in write
    self.write_and_convert(text)
  File "C:\Users\user\AppData\Roaming\Python\Python38\site-packages\colorama\ansitowin32.py", line 190, in write_and_convert
    self.write_plain_text(text, cursor, len(text))
  File "C:\Users\user\AppData\Roaming\Python\Python38\site-packages\colorama\ansitowin32.py", line 195, in write_plain_text
    self.wrapped.write(text[start:end])
RecursionError: maximum recursion depth exceeded while calling a Python object