PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0

Error when training with ABINet #10542

Closed jinxiqinghuan closed 1 year ago

jinxiqinghuan commented 1 year ago

The text I want to recognize is fairly long, so I tried setting max_text_length to a larger value. After changing it (from the default of 25 to 50), training fails with:

Traceback (most recent call last):
  File "PaddleOCR/tools/train.py", line 208, in <module>
    main(config, device, logger, vdl_writer)
  File "PaddleOCR/tools/train.py", line 180, in main
    program.train(config, train_dataloader, valid_dataloader, device, model,
  File "/home/data/experiment/ocr/text/PaddleOCR/tools/program.py", line 294, in train
    preds = model(images)
  File "/home/opt/anaconda3/envs/doc-ocr/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 948, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/opt/anaconda3/envs/doc-ocr/lib/python3.8/site-packages/paddle/fluid/dygraph/parallel.py", line 774, in forward
    outputs = self._layers(*inputs, **kwargs)
  File "/home/opt/anaconda3/envs/doc-ocr/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 948, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/data/experiment/ocr/text/PaddleOCR/ppocr/modeling/architectures/base_model.py", line 100, in forward
    x = self.head(x, targets=data)
  File "/home/opt/anaconda3/envs/doc-ocr/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 948, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/data/experiment/ocr/text/PaddleOCR/ppocr/modeling/heads/rec_abinet_head.py", line 228, in forward
    v_feature, attn_scores = self.decoder(
  File "/home/opt/anaconda3/envs/doc-ocr/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 948, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/data/experiment/ocr/text/PaddleOCR/ppocr/modeling/heads/rec_abinet_head.py", line 169, in forward
    attn_vecs = attn_scores @v  # (B, N, C)
  File "/home/opt/anaconda3/envs/doc-ocr/lib/python3.8/site-packages/paddle/fluid/dygraph/math_op_patch.py", line 298, in __impl__
    return math_op(self, other_var, False, False)
ValueError: (InvalidArgument) Input(Y) has error dim.Y'dims[1] must be equal to 256But received Y'dims[1] is 160
  [Hint: Expected y_dims[y_ndim - 2] == K, but received y_dims[y_ndim - 2]:160 != K:256.] (at /paddle/paddle/phi/kernels/impl/matmul_kernel_impl.h:314)

gpu131:248326:248326 [2] NCCL INFO comm 0x36a21810 rank 0 nranks 2 cudaDev 2 busId b1000 - Destroy COMPLETE
I0803 13:48:13.335893 248399 tcp_store.cc:257] receive shutdown event and so quit from MasterDaemon run loop
LAUNCH INFO 2023-08-03 13:48:15,249 Exit code 1
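
The traceback above ends in a batched matrix product whose inner dimensions disagree: the left operand expects 256 positions, while the right operand supplies 160. A minimal sketch of that shape rule in plain Python follows. The batch size, number of decoding steps, and channel count are hypothetical values chosen for illustration; only 256 and 160 come from the error message, and `batched_matmul_shape` is an illustrative helper, not a Paddle function.

```python
def batched_matmul_shape(x_shape, y_shape):
    """Return the result shape of x @ y for 3-D operands, or raise ValueError.

    x has shape (B, N, K) and y has shape (B, K2, C).
    The product is defined only when K == K2.
    """
    b, n, k = x_shape
    b2, k2, c = y_shape
    if b != b2:
        raise ValueError(f"batch sizes differ: {b} != {b2}")
    if k != k2:
        raise ValueError(f"y_shape[-2] must equal {k}, but received {k2}")
    return (b, n, c)


# Hypothetical sizes: batch 16, 26 decoding steps, 512 channels.
# Only the 256 and the 160 are taken from the error message.
attn_scores_shape = (16, 26, 256)
v_shape = (16, 160, 512)

try:
    batched_matmul_shape(attn_scores_shape, v_shape)
except ValueError as err:
    print("mismatch:", err)

# Once both operands agree on the number of positions, the product is defined.
print(batched_matmul_shape((16, 26, 256), (16, 256, 512)))
```

The check mirrors the hint in the log (`y_dims[y_ndim - 2] == K`): changing a config value does not help unless every layer that produces either operand is sized from the same number of positions.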

The default image size does not meet my needs either. After changing it (from the default [3, 32, 128] to [3, 32, 512]), training also fails:

Traceback (most recent call last):
  File "PaddleOCR/tools/train.py", line 208, in <module>
    main(config, device, logger, vdl_writer)
  File "PaddleOCR/tools/train.py", line 180, in main
    program.train(config, train_dataloader, valid_dataloader, device, model,
  File "/home/data/experiment/ocr/text/PaddleOCR/tools/program.py", line 294, in train
    preds = model(images)
  File "/home/opt/anaconda3/envs/doc-ocr/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 948, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/opt/anaconda3/envs/doc-ocr/lib/python3.8/site-packages/paddle/fluid/dygraph/parallel.py", line 774, in forward
    outputs = self._layers(*inputs, **kwargs)
  File "/home/opt/anaconda3/envs/doc-ocr/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 948, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/data/experiment/ocr/text/PaddleOCR/ppocr/modeling/architectures/base_model.py", line 100, in forward
    x = self.head(x, targets=data)
  File "/home/opt/anaconda3/envs/doc-ocr/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 948, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/data/experiment/ocr/text/PaddleOCR/ppocr/modeling/heads/rec_abinet_head.py", line 224, in forward
    feature = self.pos_encoder(feature)
  File "/home/opt/anaconda3/envs/doc-ocr/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 948, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/data/experiment/ocr/text/PaddleOCR/ppocr/modeling/heads/rec_nrtr_head.py", line 500, in forward
    x = x + self.pe[:paddle.shape(x)[0], :]
  File "/home/opt/anaconda3/envs/doc-ocr/lib/python3.8/site-packages/paddle/fluid/dygraph/math_op_patch.py", line 304, in __impl__
    return math_op(self, other_var, -1)
ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1024, 16, 512] and the shape of Y = [256, 1, 512]. Received [1024] in X is not equal to [256] in Y at i:0.
  [Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at /paddle/paddle/phi/kernels/funcs/common_shape.h:84)

gpu131:250597:250597 [2] NCCL INFO comm 0x3864b5a0 rank 0 nranks 2 cudaDev 2 busId b1000 - Destroy COMPLETE
I0803 13:56:35.494895 250668 tcp_store.cc:257] receive shutdown event and so quit from MasterDaemon run loop
LAUNCH INFO 2023-08-03 13:56:38,031 Exit code 1
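
The numbers in the broadcast error above are consistent with a positional-encoding table built for the default image size: 256 = (32/4) × (128/4) and 1024 = (32/4) × (512/4). The sketch below works through that arithmetic in plain Python. The downsampling factor of 4 is an assumption that happens to match both numbers; the thread does not state it, and the helper names are hypothetical.

```python
def num_positions(image_shape, stride=4):
    """Number of feature-map positions for a (C, H, W) input.

    stride=4 is an assumption that reproduces the 256 and 1024 in the log.
    """
    _, height, width = image_shape
    return (height // stride) * (width // stride)


def can_broadcast(x_shape, y_shape):
    """Broadcast rule for equal-rank shapes: dims must match or one must be <= 1."""
    if len(x_shape) != len(y_shape):
        raise ValueError("this sketch only handles shapes of equal rank")
    return all(a == b or a <= 1 or b <= 1 for a, b in zip(x_shape, y_shape))


def position_add_ok(seq_len, batch, dim, table_len):
    """Check x + pe[:seq_len] where pe has shape (table_len, 1, dim).

    Slicing past the end of the table returns only table_len rows.
    """
    pe_shape = (min(seq_len, table_len), 1, dim)
    return can_broadcast((seq_len, batch, dim), pe_shape)


default_positions = num_positions((3, 32, 128))
wide_positions = num_positions((3, 32, 512))
print(default_positions, wide_positions)  # 256 1024

# Table sized for the default image, input from the wider image: fails.
print(position_add_ok(wide_positions, 16, 512, table_len=default_positions))  # False
# Table sized for the wider image: the addition broadcasts over the batch axis.
print(position_add_ok(wide_positions, 16, 512, table_len=wide_positions))  # True
```

Under these assumptions, widening the input requires the positional-encoding table (and anything else sized from the feature map) to be enlarged to match.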
jinxiqinghuan commented 1 year ago

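The network does not seem to adapt to these parameters; max_text_length and image_shape appear to be hard-coded. I tried modifying the code at the locations where the errors are raised, and training now runs, but I am not sure my changes are correct. Or are these two parameters fixed and not meant to be changed? Thanks, everyone.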

xlg-go commented 1 year ago

Has your problem been solved? Could you share how you fixed it?

See #8549.

jinxiqinghuan commented 1 year ago

Got it, thanks a lot!

xlg-go commented 1 year ago

You're welcome.

aafkari74 commented 11 months ago

Hello, did you solve your problem? I am facing the same error. My dataset has 32*128 images.

xlg-go commented 11 months ago

Use the dygraph branch!

aliiafkari commented 11 months ago

Hello, I have been trying to get ABINet recognition working for Farsi. However, I get the error below. Could you please help me solve it? Your help and time are much appreciated.

[2023/09/27 17:30:29] ppocr INFO: Architecture :
[2023/09/27 17:30:29] ppocr INFO:     Backbone :
[2023/09/27 17:30:29] ppocr INFO:         name : ResNet45
[2023/09/27 17:30:29] ppocr INFO:     Head :
[2023/09/27 17:30:29] ppocr INFO:         iter_size : 3
[2023/09/27 17:30:29] ppocr INFO:         max_length : 25
[2023/09/27 17:30:29] ppocr INFO:         name : ABINetHead
[2023/09/27 17:30:29] ppocr INFO:         use_lang : True
[2023/09/27 17:30:29] ppocr INFO:     Transform : None
[2023/09/27 17:30:29] ppocr INFO:     algorithm : ABINet
[2023/09/27 17:30:29] ppocr INFO:     in_channels : 3
[2023/09/27 17:30:29] ppocr INFO:     model_type : rec
[2023/09/27 17:30:29] ppocr INFO: Eval :
[2023/09/27 17:30:29] ppocr INFO:     dataset :
[2023/09/27 17:30:29] ppocr INFO:         data_dir : /media/admin-pc-nezamabadi/Students/Dataset-IDPL-PFOD2-merge- with DR/resized/val
[2023/09/27 17:30:29] ppocr INFO:         label_file_list : ['/media/admin-pc-nezamabadi/Students/Dataset-IDPL-PFOD2-merge- with DR/val.txt']
[2023/09/27 17:30:29] ppocr INFO:         name : SimpleDataSet
[2023/09/27 17:30:29] ppocr INFO:         transforms :
[2023/09/27 17:30:29] ppocr INFO:             DecodeImage :
[2023/09/27 17:30:29] ppocr INFO:                 channel_first : False
[2023/09/27 17:30:29] ppocr INFO:                 img_mode : BGR
[2023/09/27 17:30:29] ppocr INFO:             MultiLabelEncode : None
[2023/09/27 17:30:29] ppocr INFO:             RecResizeImg :
[2023/09/27 17:30:29] ppocr INFO:                 image_shape : [3, 32, 128]
[2023/09/27 17:30:29] ppocr INFO:             KeepKeys :
[2023/09/27 17:30:29] ppocr INFO:                 keep_keys : ['image', 'label_ctc', 'label_sar', 'length', 'valid_ratio']
[2023/09/27 17:30:29] ppocr INFO:     loader :
[2023/09/27 17:30:29] ppocr INFO:         batch_size_per_card : 64
[2023/09/27 17:30:29] ppocr INFO:         drop_last : False
[2023/09/27 17:30:29] ppocr INFO:         num_workers : 24
[2023/09/27 17:30:29] ppocr INFO:         shuffle : False
[2023/09/27 17:30:29] ppocr INFO: Global :
[2023/09/27 17:30:29] ppocr INFO:     cal_metric_during_train : True
[2023/09/27 17:30:29] ppocr INFO:     character_dict_path : /media/admin-pc-nezamabadi/Students/Dataset-IDPL-PFOD2-merge- with DR/char.txt
[2023/09/27 17:30:29] ppocr INFO:     character_type : fa
[2023/09/27 17:30:29] ppocr INFO:     checkpoints : None
[2023/09/27 17:30:29] ppocr INFO:     distributed : False
[2023/09/27 17:30:29] ppocr INFO:     epoch_num : 1500
[2023/09/27 17:30:29] ppocr INFO:     eval_batch_step : [0, 2000]
[2023/09/27 17:30:29] ppocr INFO:     infer_img : None
[2023/09/27 17:30:29] ppocr INFO:     infer_mode : False
[2023/09/27 17:30:29] ppocr INFO:     log_smooth_window : 20
[2023/09/27 17:30:29] ppocr INFO:     max_text_length : 25
[2023/09/27 17:30:29] ppocr INFO:     pretrained_model : /home/admin-pc-nezamabadi/paddelpaddle-env/output-IDPL2-ABInet-06/best_accuracy
[2023/09/27 17:30:29] ppocr INFO:     print_batch_step : 10
[2023/09/27 17:30:29] ppocr INFO:     save_epoch_step : 100
[2023/09/27 17:30:29] ppocr INFO:     save_inference_dir : None
[2023/09/27 17:30:29] ppocr INFO:     save_model_dir : /home/admin-pc-nezamabadi/paddelpaddle-env/output-IDPL2-ABInet-06/
[2023/09/27 17:30:29] ppocr INFO:     save_res_path : /home/admin-pc-nezamabadi/paddelpaddle-env/output-IDPL2-ABInet-06/rec/predicts_abinet.txt
[2023/09/27 17:30:29] ppocr INFO:     use_gpu : True
[2023/09/27 17:30:29] ppocr INFO:     use_space_char : True
[2023/09/27 17:30:29] ppocr INFO:     use_visualdl : False
[2023/09/27 17:30:29] ppocr INFO: Loss :
[2023/09/27 17:30:29] ppocr INFO:     name : CELoss
[2023/09/27 17:30:29] ppocr INFO: Metric :
[2023/09/27 17:30:29] ppocr INFO:     main_indicator : acc
[2023/09/27 17:30:29] ppocr INFO:     name : RecMetric
[2023/09/27 17:30:29] ppocr INFO: Optimizer :
[2023/09/27 17:30:29] ppocr INFO:     beta1 : 0.9
[2023/09/27 17:30:29] ppocr INFO:     beta2 : 0.99
[2023/09/27 17:30:29] ppocr INFO:     clip_norm : 20.0
[2023/09/27 17:30:29] ppocr INFO:     lr :
[2023/09/27 17:30:29] ppocr INFO:         decay_epochs : [6]
[2023/09/27 17:30:29] ppocr INFO:         name : Piecewise
[2023/09/27 17:30:29] ppocr INFO:         values : [0.0001, 1e-05]
[2023/09/27 17:30:29] ppocr INFO:     name : Adam
[2023/09/27 17:30:29] ppocr INFO:     regularizer :
[2023/09/27 17:30:29] ppocr INFO:         factor : 0.0
[2023/09/27 17:30:29] ppocr INFO:         name : L2
[2023/09/27 17:30:29] ppocr INFO: PostProcess :
[2023/09/27 17:30:29] ppocr INFO:     name : ABINetLabelDecode
[2023/09/27 17:30:29] ppocr INFO: Train :
[2023/09/27 17:30:29] ppocr INFO:     dataset :
[2023/09/27 17:30:29] ppocr INFO:         data_dir : /media/admin-pc-nezamabadi/Students/Dataset-IDPL-PFOD2-merge- with DR/resized/train
[2023/09/27 17:30:29] ppocr INFO:         ext_op_transform_idx : 1
[2023/09/27 17:30:29] ppocr INFO:         label_file_list : ['/media/admin-pc-nezamabadi/Students/Dataset-IDPL-PFOD2-merge- with DR/train.txt']
[2023/09/27 17:30:29] ppocr INFO:         name : SimpleDataSet
[2023/09/27 17:30:29] ppocr INFO:         transforms :
[2023/09/27 17:30:29] ppocr INFO:             DecodeImage :
[2023/09/27 17:30:29] ppocr INFO:                 channel_first : False
[2023/09/27 17:30:29] ppocr INFO:                 img_mode : BGR
[2023/09/27 17:30:29] ppocr INFO:             RecConAug :
[2023/09/27 17:30:29] ppocr INFO:                 ext_data_num : 2
[2023/09/27 17:30:29] ppocr INFO:                 image_shape : [32, 128, 3]
[2023/09/27 17:30:29] ppocr INFO:                 prob : 0.5
[2023/09/27 17:30:29] ppocr INFO:             RecAug : None
[2023/09/27 17:30:29] ppocr INFO:             MultiLabelEncode : None
[2023/09/27 17:30:29] ppocr INFO:             RecResizeImg :
[2023/09/27 17:30:29] ppocr INFO:                 image_shape : [3, 32, 128]
[2023/09/27 17:30:29] ppocr INFO:             KeepKeys :
[2023/09/27 17:30:29] ppocr INFO:                 keep_keys : ['image', 'label_ctc', 'label_sar', 'length', 'valid_ratio']
[2023/09/27 17:30:29] ppocr INFO:     loader :
[2023/09/27 17:30:29] ppocr INFO:         batch_size_per_card : 64
[2023/09/27 17:30:29] ppocr INFO:         drop_last : True
[2023/09/27 17:30:29] ppocr INFO:         num_workers : 48
[2023/09/27 17:30:29] ppocr INFO:         shuffle : True
[2023/09/27 17:30:29] ppocr INFO: profiler_options : None
[2023/09/27 17:30:29] ppocr INFO: train with paddle 2.4.2 and device Place(gpu:0)
[2023/09/27 17:30:29] ppocr INFO: Initialize indexs of datasets:['/media/admin-pc-nezamabadi/Students/Dataset-IDPL-PFOD2-merge- with DR/train.txt']
[2023/09/27 17:30:30] ppocr INFO: Initialize indexs of datasets:['/media/admin-pc-nezamabadi/Students/Dataset-IDPL-PFOD2-merge- with DR/val.txt']
W0927 17:30:30.723640 5592 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.7, Runtime API Version: 11.7
W0927 17:30:30.739579 5592 gpu_resources.cc:91] device: 0, cuDNN Version: 8.6.
[2023/09/27 17:30:32] ppocr INFO: train dataloader has 20291 iters
[2023/09/27 17:30:32] ppocr INFO: valid dataloader has 5060 iters
[2023/09/27 17:30:32] ppocr WARNING: The shape of model params head.cls.weight [512, 157] not matched with loaded params head.cls.weight [512, 37] !
[2023/09/27 17:30:32] ppocr WARNING: The shape of model params head.cls.bias [157] not matched with loaded params head.cls.bias [37] !
[2023/09/27 17:30:32] ppocr WARNING: The shape of model params head.language.proj.weight [157, 512] not matched with loaded params head.language.proj.weight [37, 512] !
[2023/09/27 17:30:32] ppocr WARNING: The shape of model params head.language.cls.weight [512, 157] not matched with loaded params head.language.cls.weight [512, 37] !
[2023/09/27 17:30:32] ppocr WARNING: The shape of model params head.language.cls.bias [157] not matched with loaded params head.language.cls.bias [37] !
[2023/09/27 17:30:32] ppocr WARNING: The shape of model params head.cls_align.weight [512, 157] not matched with loaded params head.cls_align.weight [512, 37] !
[2023/09/27 17:30:32] ppocr WARNING: The shape of model params head.cls_align.bias [157] not matched with loaded params head.cls_align.bias [37] !
[2023/09/27 17:30:32] ppocr INFO: load pretrain successful from /home/admin-pc-nezamabadi/paddelpaddle-env/output-IDPL2-ABInet-06/best_accuracy
[2023/09/27 17:30:32] ppocr INFO: During the training process, after the 0th iteration, an evaluation is run every 2000 iterations
Traceback (most recent call last):
  File "/home/admin-pc-nezamabadi/paddelpaddle-env/PaddleOCR/tools/train.py", line 208, in <module>
    main(config, device, logger, vdl_writer)
  File "/home/admin-pc-nezamabadi/paddelpaddle-env/PaddleOCR/tools/train.py", line 180, in main
    program.train(config, train_dataloader, valid_dataloader, device, model,
  File "/home/admin-pc-nezamabadi/paddelpaddle-env/PaddleOCR/tools/program.py", line 295, in train
    loss = loss_class(preds, batch)
  File "/home/admin-pc-nezamabadi/paddelpaddle-env/lib/python3.10/site-packages/paddle/fluid/dygraph/layers.py", line 1012, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/admin-pc-nezamabadi/paddelpaddle-env/PaddleOCR/ppocr/losses/rec_ce_loss.py", line 36, in forward
    loss[name + '_loss'] = self.loss_func(flt_logtis, flt_tgt)
  File "/home/admin-pc-nezamabadi/paddelpaddle-env/lib/python3.10/site-packages/paddle/fluid/dygraph/layers.py", line 1012, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/admin-pc-nezamabadi/paddelpaddle-env/lib/python3.10/site-packages/paddle/nn/layer/loss.py", line 377, in forward
    ret = paddle.nn.functional.cross_entropy(
  File "/home/admin-pc-nezamabadi/paddelpaddle-env/lib/python3.10/site-packages/paddle/nn/functional/loss.py", line 2566, in cross_entropy
    _, out = _C_ops.cross_entropy_with_softmax(
ValueError: (InvalidArgument) Input(Logits) and Input(Label) should in same shape in dimensions except axis.
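
The final error says that, apart from the class axis, the logits and the labels must have the same shape. A minimal plain-Python sketch of that rule follows. The batch size of 64 and the 157 classes are taken from the log above; the sequence lengths 26 and 25 are hypothetical values chosen only to show how a one-step difference between predictions and labels breaks the rule. The log does not confirm that this is what happened in this run, and the helper names are illustrative.

```python
def flatten_for_ce(logits_shape, label_shape):
    """Flatten (B, T, C) logits to (B*T, C) and (B, L) labels to (B*L,)."""
    batch, steps, classes = logits_shape
    label_batch, label_len = label_shape
    return (batch * steps, classes), (label_batch * label_len,)


def ce_shapes_compatible(flat_logits_shape, flat_label_shape):
    """Logits without their class axis must match the label shape."""
    return flat_logits_shape[:-1] == flat_label_shape


# 64 and 157 come from the log; 26 and 25 are hypothetical sequence lengths.
flat_logits, flat_labels = flatten_for_ce((64, 26, 157), (64, 25))
print(flat_logits, flat_labels)  # (1664, 157) (1600,)
print(ce_shapes_compatible(flat_logits, flat_labels))  # False

# With labels encoded to the same length as the predictions, the rule holds.
flat_logits, flat_labels = flatten_for_ce((64, 26, 157), (64, 26))
print(ce_shapes_compatible(flat_logits, flat_labels))  # True
```

Printing the shapes of the prediction and label tensors just before the loss call is a quick way to see which side deviates in a real run.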

aliiafkari commented 11 months ago

@yong-asial @andyjpaddle @jinxiqinghuan Hello, I get the above error while trying to train the ABINet recognition model. Do you have any solution for this issue? Thanks in advance.

Homura852 commented 5 months ago

Hi, did you manage to train ABINet successfully? Could we exchange contact details to discuss it?

aliiafkari commented 5 months ago

Hello, my dear friend.

Yes, I have trained the ABINet model on my own dataset.

Feel free to contact me at the following email: a.afkari74@gmail.com

Homura852 commented 5 months ago

Thank you, my dear friend. I have now trained the ABINet model; thank you for your help. If I have any other questions in the future, I will contact you by email.

aliiafkari commented 5 months ago

Sure, happy to help :)