PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0

Access to an undefined portion of a memory object #13712

Closed YiLin198 closed 2 months ago

YiLin198 commented 3 months ago


🐛 Bug (Problem description)

An error is raised when recognizing a PDF, but recognizing a JPG works fine. I am running on CPU; there is no GPU.


C++ Traceback (most recent call last):

No stack trace in paddle, may be caused by external reasons.


Error Message Summary:

FatalError: Access to an undefined portion of a memory object is detected by the operating system.
  [TimeInfo: Aborted at 1724231265 (unix time) try "date -d @1724231265" if you are using GNU date ]
  [SignalInfo: SIGSEGV (@0x0) received by PID 43837 (TID 0x7f36481a5700) from PID 0 ]
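The TimeInfo field in the error message carries a Unix timestamp, which helps when matching the crash against other logs. As a small sketch using only the Python standard library (the variable names here are illustrative, not part of Paddle), it can be decoded without GNU date:

```python
from datetime import datetime, timezone

# Value copied from the TimeInfo field of the error message.
aborted_at = 1724231265

# Interpret the timestamp as UTC so the result does not depend on
# the local time zone of the machine running this snippet.
crash_time = datetime.fromtimestamp(aborted_at, tz=timezone.utc)
print(crash_time.isoformat())  # 2024-08-21T09:07:45+00:00
```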

🏃‍♂️ Environment (Runtime environment)

OS: Linux CentOS 7
Hardware: 8 CPU, 32G, no GPU
paddleocr 2.8.1
paddlepaddle 2.6.1
python 3.8.13

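🌰 Minimal Reproducible Example (Minimal demo that reproduces the problem)

from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang="ch", use_gpu=False)
img_path = "/path/to/pdf"  # placeholder for the path to the PDF file
result = ocr.ocr(img_path, cls=True)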

GreatV commented 3 months ago

I can't reproduce this. Try updating your paddlepaddle version. The following runs fine for me:

from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang="ch", use_gpu=False)
img_path = "ppstructure/docs/recovery/UnrealText.pdf"
result = ocr.ocr(img_path, cls=True)
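Because the error report says there is no stack trace from Paddle, one possible way to get more detail is Python's standard `faulthandler` module, which prints the Python-level traceback when the process receives SIGSEGV. The following is only a sketch: the `run_ocr` helper and the script name `debug_ocr.py` are hypothetical, and the PaddleOCR calls simply mirror the ones already shown in this thread.

```python
import faulthandler
import sys

# Dump the Python traceback of every thread to stderr if the process
# receives a fatal signal such as SIGSEGV.
faulthandler.enable(all_threads=True)


def run_ocr(pdf_path):
    """Run PaddleOCR on one file, mirroring the snippet in this thread."""
    # Imported lazily so that faulthandler is already active while
    # paddleocr and its native dependencies are loaded.
    from paddleocr import PaddleOCR

    ocr = PaddleOCR(use_angle_cls=True, lang="ch", use_gpu=False)
    return ocr.ocr(pdf_path, cls=True)


if __name__ == "__main__":
    # Hypothetical invocation: python debug_ocr.py /path/to/pdf
    if len(sys.argv) > 1:
        print(run_ocr(sys.argv[1]))
```

The same handler can be enabled without editing any code by starting the interpreter with `python -X faulthandler`. The resulting traceback shows which Python call was active when the crash happened, which may help narrow down whether the fault occurs while the PDF is being read or during inference.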