PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0
44.57k stars 7.85k forks

Debugging layout analysis on the CPU version: 'list' object has no attribute 'shape' #13194

Closed Arcolaus closed 5 months ago

Arcolaus commented 5 months ago

Problem Description

An attribute error is raised when running layout recovery.

Runtime Environment

Reproduction Code

Adapted from the example code in section 2.2.3, Layout Analysis:

import os
import cv2
from paddleocr import PPStructure,save_structure_res
import numpy as np

ocr_engine = PPStructure(table=True, ocr=True, show_log=True)

save_folder = './output_other'
img_path = 'sx.pdf'
result = ocr_engine(img_path)
for index, res in enumerate(result):
    save_structure_res(res, save_folder, os.path.basename(img_path).split('.')[0], index)

for res in result:
    for line in res:
        line.pop('img')
        print(line)

Complete Error Message

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[4], line 10
      8 save_folder = './output_other'
      9 img_path = 'sx.pdf'
---> 10 result = ocr_engine(img_path)
     11 for index, res in enumerate(result):
     12     save_structure_res(res, save_folder, os.path.basename(img_path).split('.')[0], index)

File e:\miniconda3\envs\paddle\lib\site-packages\paddleocr\paddleocr.py:766, in PPStructure.__call__(self, img, return_ocr_result_in_table, img_idx)
    764 def __call__(self, img, return_ocr_result_in_table=False, img_idx=0):
    765     img = check_img(img)
--> 766     res, _ = super().__call__(
    767         img, return_ocr_result_in_table, img_idx=img_idx)
    768     return res

File e:\miniconda3\envs\paddle\lib\site-packages\paddleocr\ppstructure\predict_system.py:112, in StructureSystem.__call__(self, img, return_ocr_result_in_table, img_idx)
    110 ori_im = img.copy()
    111 if self.layout_predictor is not None:
--> 112     layout_res, elapse = self.layout_predictor(img)
    113     time_dict['layout'] += elapse
    114 else:

File e:\miniconda3\envs\paddle\lib\site-packages\paddleocr\ppstructure\layout\predict_layout.py:73, in LayoutPredictor.__call__(self, img)
     71 ori_im = img.copy()
     72 data = {'image': img}
---> 73 data = transform(data, self.preprocess_op)
     74 img = data[0]
     76 if img is None:

File e:\miniconda3\envs\paddle\lib\site-packages\paddleocr\ppocr\data\imaug\__init__.py:56, in transform(data, ops)
     54     ops = []
     55 for op in ops:
---> 56     data = op(data)
     57     if data is None:
     58         return None

File e:\miniconda3\envs\paddle\lib\site-packages\paddleocr\ppocr\data\imaug\operators.py:192, in Resize.__call__(self, data)
    189 if 'polys' in data:
    190     text_polys = data['polys']
--> 192 img_resize, [ratio_h, ratio_w] = self.resize_image(img)
    193 if 'polys' in data:
    194     new_boxes = []

File e:\miniconda3\envs\paddle\lib\site-packages\paddleocr\ppocr\data\imaug\operators.py:181, in Resize.resize_image(self, img)
    179 def resize_image(self, img):
    180     resize_h, resize_w = self.size
--> 181     ori_h, ori_w = img.shape[:2]  # (h, w, c)
    182     ratio_h = float(resize_h) / ori_h
    183     ratio_w = float(resize_w) / ori_w

AttributeError: 'list' object has no attribute 'shape'

Possible Solutions
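
One reading of the traceback above: the input is a PDF path (sx.pdf), and by the time it reaches Resize.resize_image the img argument is a Python list rather than a single image array. That is consistent with the PDF having been expanded into a list of page images before layout analysis runs. Under that assumption, a minimal workaround sketch is to call the engine once per page. run_per_page is a hypothetical helper, not part of PaddleOCR; the only thing taken from the library is the img_idx keyword visible in the PPStructure.__call__ signature in the traceback.

```python
def run_per_page(engine, img):
    """Call `engine` once per page when `img` is a list of page images.

    `engine` is any callable with the shape engine(page, img_idx=...),
    which matches the PPStructure.__call__ signature shown in the traceback.
    """
    # Assumption: a PDF input has already been decoded into a list of
    # page images. A single image is wrapped so both cases share one path.
    pages = img if isinstance(img, (list, tuple)) else [img]

    results = []
    for page_idx, page in enumerate(pages):
        # Each call receives one image (an object with `.shape`),
        # never the whole list, so Resize.resize_image can read its size.
        results.append(engine(page, img_idx=page_idx))
    return results
```

With a real engine this would be called as run_per_page(ocr_engine, pages), where pages is a list of page images decoded from the PDF. How the pages are rasterized is deliberately left out here. Each entry of the returned list corresponds to one page.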

Appendix

I don't think the file itself is the problem, but I can provide it if it is really needed.

IeohMingChan commented 4 months ago

I have the same problem. How did you solve it?

GreatV commented 4 months ago

@IeohMingChan Install from the main branch and try the latest version.

IeohMingChan commented 4 months ago

How do I install it? I currently have paddleocr 2.7.5 installed via pip, on Python 3.8. How do I switch to the main branch? Do I need to download the source code?

GreatV commented 4 months ago

@IeohMingChan Try pip install git+https://github.com/PaddlePaddle/PaddleOCR.git

IeohMingChan commented 4 months ago

It failed with an error: ERROR: Could not detect requirement name for 'git+https://github.com/PaddlePaddle/PaddleOCR.git', please specify one with #egg=your_package_name

IeohMingChan commented 4 months ago

GreatV commented 4 months ago

Try pip install git+https://github.com/PaddlePaddle/PaddleOCR.git

IeohMingChan commented 4 months ago

PS C:\Users\Administrator\PycharmProjects\ocrEvaluation> pip install git+https://github.com/PaddlePaddle/PaddleOCR.git
Collecting git+https://github.com/PaddlePaddle/PaddleOCR.git
  Cloning https://github.com/PaddlePaddle/PaddleOCR.git to c:\users\administrator\appdata\local\temp\pip-req-build-x565h9dl
  Running command git clone --filter=blob:none --quiet https://github.com/PaddlePaddle/PaddleOCR.git 'C:\Users\Administrator\AppData\Local\Temp\pip-req-build-x565h9dl'
  error: RPC failed; curl 18 HTTP/2 stream 3 was reset
  error: 135 bytes of body are still expected
  fetch-pack: unexpected disconnect while reading sideband packet
  fatal: early EOF
  fatal: fetch-pack: invalid index-pack output
  fatal: could not fetch fdc3c054c0be41a1bda613fc572af0d9cf6f3c13 from promisor remote
  warning: Clone succeeded, but checkout failed.
  You can inspect what was checked out with 'git status'
  and retry with 'git restore --source=HEAD :/'

  error: subprocess-exited-with-error

  × git clone --filter=blob:none --quiet https://github.com/PaddlePaddle/PaddleOCR.git 'C:\Users\Administrator\AppData\Local\Temp\pip-req-build-x565h9dl' did not run successfully.
  │ exit code: 128
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet https://github.com/PaddlePaddle/PaddleOCR.git 'C:\Users\Administrator\AppData\Local\Temp\pip-req-build-x565h9dl' did not run successfully.
│ exit code: 128
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

Do I need to uninstall the existing paddleocr first?

GreatV commented 4 months ago

That is a network problem on your side.

IeohMingChan commented 4 months ago

I got it installed locally. Thanks a lot!
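
After switching from the pip release to a source install, it can help to confirm which paddleocr distribution the current environment actually resolves. The sketch below uses only the standard library (importlib.metadata, available from Python 3.8, the version mentioned in this thread). installed_version is a hypothetical helper name, and the version string reported for a source install depends on the checkout, so no specific value is assumed.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    # Prints None if paddleocr is not installed in the active environment.
    print("paddleocr:", installed_version("paddleocr"))
```

If this still reports 2.7.5 after a source install, the old release is probably still the one being imported, and uninstalling it before reinstalling is a reasonable next step.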