PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0

ValueError: (InvalidArgument) shape should have the save dim with perm, but received shape size is:4, perm size is:3. #13874

Closed letienan1998 closed 1 month ago

letienan1998 commented 1 month ago

🐛 Bug (Problem description)

I'm training the SAR model on a new language and am stuck at predict_rec.py. I run this command:

python3 tools/infer/predict_rec.py --image_dir="./predict/0.jpg" --rec_model_dir="./inference/rec/sar" --rec_image_shape="3, 100, 100,320" --rec_char_dict_path="./ppocr/utils/vi_dict.txt"

It shows me:

[2024/09/15 21:37:28] ppocr WARNING: The first GPU is used for inference by default, GPU ID: 0
[2024/09/15 21:37:35] ppocr INFO: In PP-OCRv3, rec_image_shape parameter defaults to '3, 48, 320', if you are using recognition model with PP-OCRv2 or an older version, please set --rec_image_shape='3,32,320
Backend GTK4Agg is interactive backend. Turning interactive mode on.
[2024/09/15 21:37:46] ppocr INFO: Traceback (most recent call last):
  File "/home/workspace/PaddleOCR/tools/infer/predict_rec.py", line 860, in main
    rec_res, _ = text_recognizer(img_list)
  File "/home/workspace/PaddleOCR/tools/infer/predict_rec.py", line 706, in __call__
    self.predictor.run()
ValueError: (InvalidArgument) shape should have the save dim with perm, but received shape size is:4, perm size is:3.
  [Hint: Expected shape.size() == perm.size(), but received shape.size():4 != perm.size():3.] (at /paddle/paddle/phi/kernels/funcs/transpose_function.cu.h:663)
  [operator < transpose2 > error]

[2024/09/15 21:37:46] ppocr INFO: (InvalidArgument) shape should have the save dim with perm, but received shape size is:4, perm size is:3.
  [Hint: Expected shape.size() == perm.size(), but received shape.size():4 != perm.size():3.] (at /paddle/paddle/phi/kernels/funcs/transpose_function.cu.h:663)
  [operator < transpose2 > error]

🏃‍♂️ Environment (Runtime environment)

My environment:

OS: ubuntu 22.04
python 3.10.12
paddleocr & paddlepaddle: 2.6.1
CPU: i5 12400F
RAM: 32GB
GPU: GTX 1060 6G

🌰 Minimal Reproducible Example (Minimal demo that reproduces the problem)

Thank you for any help

letienan1998 commented 1 month ago

Additional info: I'm using CUDA 8.5.

jingsongliujing commented 1 month ago

The parameter you provided --rec_image_shape="3, 100, 100,320" contains four dimensions, while the error message indicates that the model expects an input shape with three dimensions. You can try changing the --rec_image_shape parameter to "3, 32, 320", which removes the extra dimension.
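To make the dimension count concrete, here is a minimal sketch that splits a comma-separated shape string the way the values above are written and counts the entries. The helper name parse_rec_image_shape is hypothetical and is not taken from PaddleOCR; it only illustrates that the first command passes four values while the suggested value has three.

```python
def parse_rec_image_shape(value: str) -> list[int]:
    """Split a comma-separated shape string into integers.

    Hypothetical helper for illustration only; int() tolerates the
    surrounding spaces in strings such as "3, 100, 100,320".
    """
    return [int(part) for part in value.split(",")]


# Value from the first command in this thread.
first_attempt = parse_rec_image_shape("3, 100, 100,320")
# Value suggested in the reply above.
suggested = parse_rec_image_shape("3, 32, 320")

print(len(first_attempt), first_attempt)
print(len(suggested), suggested)
```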

letienan1998 commented 1 month ago

Thank you for your help. But when I set the --rec_image_shape parameter to "3, 32, 320", I get a new error:

python3 tools/infer/predict_rec.py --image_dir="./predict/0.jpg" --rec_model_dir="./inference/rec/sar" --rec_image_shape="3, 100,320" --rec_char_dict_path="./ppocr/utils/vi_dict.txt"
[2024/09/18 21:11:10] ppocr WARNING: The first GPU is used for inference by default, GPU ID: 0
[2024/09/18 21:11:17] ppocr INFO: In PP-OCRv3, rec_image_shape parameter defaults to '3, 48, 320', if you are using recognition model with PP-OCRv2 or an older version, please set --rec_image_shape='3,32,320
[2024/09/18 21:11:19] ppocr INFO: Traceback (most recent call last):
  File "/home/workspace/PaddleOCR/tools/infer/predict_rec.py", line 860, in main
    rec_res, _ = text_recognizer(img_list)
  File "/home/workspace/PaddleOCR/tools/infer/predict_rec.py", line 793, in __call__
    self.predictor.run()
ValueError: (InvalidArgument) The size of Op(Conv) inputs should not be 0.
  [Hint: Expected in_dims[i] != 0, but received in_dims[i]:0 == 0:0.] (at /paddle/paddle/phi/infermeta/binary.cc:494)
  [operator < conv2d > error]

[2024/09/18 21:11:19] ppocr INFO: (InvalidArgument) The size of Op(Conv) inputs should not be 0.
  [Hint: Expected in_dims[i] != 0, but received in_dims[i]:0 == 0:0.] (at /paddle/paddle/phi/infermeta/binary.cc:494)
  [operator < conv2d > error]

The image I'm using for prediction is the same image I ran inference on.

jingsongliujing commented 1 month ago

See the SAR documentation: https://paddlepaddle.github.io/PaddleOCR/en/algorithm/text_recognition/algorithm_rec_sar.html#41-python-inference

letienan1998 commented 1 month ago

Thank you very much! It worked with shape 3x48x48x160! But I trained with shape 3x100x320 and exported the model, so I thought I could use that shape.
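For readers landing here later, a rough sketch of a pre-flight check on the shape string before calling predict_rec.py with a SAR model. It assumes the four values mean channels, height, minimum width, and maximum width; that reading is an assumption based on the working value 3x48x48x160 and the linked SAR documentation, and is not confirmed in this thread. The function name describe_sar_shape is hypothetical.

```python
def describe_sar_shape(value: str) -> dict[str, int]:
    """Label the four entries of a SAR-style shape string.

    Assumption (not confirmed in the thread): the order is
    channels, height, min_width, max_width.
    """
    parts = [int(part) for part in value.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 values, got {len(parts)}: {parts}")
    channels, height, min_width, max_width = parts
    return {
        "channels": channels,
        "height": height,
        "min_width": min_width,
        "max_width": max_width,
    }


# The value reported as working in this thread.
print(describe_sar_shape("3, 48, 48, 160"))
```

Under that assumption, a three-value string such as "3, 100,320" is rejected by the check before it reaches the model.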