PaddlePaddle / Paddle

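PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of PaddlePaddle, "飞桨": high-performance single-machine and distributed training and cross-platform deployment for deep learning & machine learning)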
http://www.paddlepaddle.org/
Apache License 2.0

[BUG] 2.6: SIGILL error when loading the official PaddleOCR v4 models; rolling back to paddlepaddle-2.5.2 works. #60477

Open gowy222 opened 11 months ago

gowy222 commented 11 months ago

Describe the Bug

I was following https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_ch/quickstart.md to try it out. The environment is Linux inside Docker; I tried various distributions and Python versions 3.8 through 3.10...

pip install --no-cache-dir paddlepaddle paddleocr

Installation verification passed:

`#7 50.04 I1229 18:18:25.484820 874 interpretercore.cc:237] New Executor is Running.
#7 50.07 I1229 18:18:25.518586 874 interpreter_util.cc:518] Standalone Executor is Used.
#7 50.08 Running verify PaddlePaddle program ...
#7 50.08 PaddlePaddle works well on 1 CPU.
#7 50.08 PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.`

After that, initializing ocr = PaddleOCR(use_gpu=False,lang="ch") in app.py automatically downloads the official models:

download https://paddleocr.bj.bcebos.com/PP-OCRv4/chinese/ch_PP-OCRv4_det_infer.tar to /root/.paddleocr/whl/det/ch/ch_PP-OCRv4_det_infer/ch_PP-OCRv4_det_infer.tar
100%|██████████| 4.89M/4.89M [00:01<00:00, 2.72MiB/s]
download https://paddleocr.bj.bcebos.com/PP-OCRv4/chinese/ch_PP-OCRv4_rec_infer.tar to /root/.paddleocr/whl/rec/ch/ch_PP-OCRv4_rec_infer/ch_PP-OCRv4_rec_infer.tar
100%|██████████| 11.0M/11.0M [00:02<00:00, 4.51MiB/s]
download https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar to /root/.paddleocr/whl/cls/ch_ppocr_mobile_v2.0_cls_infer/ch_ppocr_mobile_v2.0_cls_infer.tar
100%|██████████| 2.19M/2.19M [00:01<00:00, 1.46MiB/s]

Immediately afterwards, loading the model fails. (Every configuration fails; enabling or disabling cls makes no difference.)

`--------------------------------------
C++ Traceback (most recent call last):

0 paddle_infer::Predictor::Predictor(paddle::AnalysisConfig const&)
1 std::unique_ptr<paddle::PaddlePredictor, std::default_delete > paddle::CreatePaddlePredictor<paddle::AnalysisConfig, (paddle::PaddleEngineKind)2>(paddle::AnalysisConfig const&)
2 paddle::AnalysisPredictor::Init(std::shared_ptr const&, std::shared_ptr const&)
3 paddle::AnalysisPredictor::PrepareProgram(std::shared_ptr const&)
4 paddle::AnalysisPredictor::OptimizeInferenceProgram()
5 paddle::inference::analysis::Analyzer::RunAnalysis(paddle::inference::analysis::Argument)
6 paddle::inference::analysis::IrAnalysisPass::RunImpl(paddle::inference::analysis::Argument)
7 paddle::inference::analysis::IRPassManager::Apply(std::unique_ptr<paddle::framework::ir::Graph, std::default_delete >)
8 paddle::framework::ir::Pass::Apply(paddle::framework::ir::Graph) const
9 paddle::framework::ir::SelfAttentionFusePass::ApplyImpl(paddle::framework::ir::Graph) const
10 paddle::framework::ir::GraphPatternDetector::operator()(paddle::framework::ir::Graph, std::function<void (std::map<paddle::framework::ir::PDNode, paddle::framework::ir::Node, paddle::framework::ir::GraphPatternDetector::PDNodeCompare, std::allocator<std::pair<paddle::framework::ir::PDNode const, paddle::framework::ir::Node> > > const&, paddle::framework::ir::Graph)>)


Error Message Summary:

FatalError: Illegal instruction is detected by the operating system. [TimeInfo: Aborted at 1703826924 (unix time) try "date -d @1703826924" if you are using GNU date ] [SignalInfo: SIGILL (@0x7f421eaa186a) received by PID 1 (TID 0x7f422642b740) from PID 514463850 ]`

Additional Supplementary Information

No response

tink2123 commented 11 months ago

Which CUDA version is the installed Paddle whl package built for? We will try to reproduce the problem.

gowy222 commented 11 months ago

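Which CUDA version is the installed Paddle whl package built for? We will try to reproduce the problem.

It's a cloud server... the CPU-only build... nothing GPU-related is involved.

I simply ran pip install --no-cache-dir paddlepaddle paddleocr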

tink2123 commented 11 months ago

We could not reproduce the problem on a CPU-only setup. Is this the prediction command you used?

paddleocr --image_dir=doc/imgs/1.jpg --use_gpu=False

gowy222 commented 11 months ago

We could not reproduce the problem on a CPU-only setup. Is this the prediction command you used?

paddleocr --image_dir=doc/imgs/1.jpg --use_gpu=False

I ran the test inside Docker:

FROM python:3.10-slim-bullseye
ENV TZ=Asia/Shanghai
ENV DEBIAN_FRONTEND=noninteractive
COPY app.py /

RUN apt-get update
RUN apt-get install -y libgomp1
RUN pip install --no-cache-dir paddlepaddle paddleocr

The app.py code is copied from the reference at https://pypi.org/project/paddleocr/ (PaddleOCR depends on PaddlePaddle):

from paddleocr import PaddleOCR
ocr = PaddleOCR(use_angle_cls=True,use_gpu=False,lang="ch")
result = ocr.ocr(local_file_path, det=True, rec=True, cls=True)

The initialization line ocr = PaddleOCR(use_angle_cls=True,use_gpu=False,lang="ch") raises the error:

` ppocr DEBUG: Namespace(help='==SUPPRESS==', use_gpu=False, use_xpu=False, use_npu=False, ir_optim=True, use_tensorrt=False, min_subgraph_size=15, precision='fp32', gpu_mem=500, gpu_id=0, image_dir=None, page_num=0, det_algorithm='DB', det_model_dir='/root/.paddleocr/whl/det/ch/ch_PP-OCRv4_det_infer', det_limit_side_len=960, det_limit_type='max', det_box_type='quad', det_db_thresh=0.3, det_db_box_thresh=0.6, det_db_unclip_ratio=1.5, max_batch_size=10, use_dilation=False, det_db_score_mode='fast', det_east_score_thresh=0.8, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_sast_score_thresh=0.5, det_sast_nms_thresh=0.2, det_pse_thresh=0, det_pse_box_thresh=0.85, det_pse_min_area=16, det_pse_scale=1, scales=[8, 16, 32], alpha=1.0, beta=1.0, fourier_degree=5, rec_algorithm='SVTR_LCNet', rec_model_dir='/root/.paddleocr/whl/rec/ch/ch_PP-OCRv4_rec_infer', rec_image_inverse=True, rec_image_shape='3, 48, 320', rec_batch_num=6, max_text_length=25, rec_char_dict_path='/usr/local/lib/python3.10/dist-packages/paddleocr/ppocr/utils/ppocr_keys_v1.txt', use_space_char=True, vis_font_path='./doc/fonts/simfang.ttf', drop_score=0.5, e2e_algorithm='PGNet', e2e_model_dir=None, e2e_limit_side_len=768, e2e_limit_type='max', e2e_pgnet_score_thresh=0.5, e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_pgnet_valid_set='totaltext', e2e_pgnet_mode='fast', use_angle_cls=True, cls_model_dir='/root/.paddleocr/whl/cls/ch_ppocr_mobile_v2.0_cls_infer', cls_image_shape='3, 48, 192', label_list=['0', '180'], cls_batch_num=6, cls_thresh=0.9, enable_mkldnn=False, cpu_threads=10, use_pdserving=False, warmup=False, sr_model_dir=None, sr_image_shape='3, 32, 128', sr_batch_num=1, draw_img_save_dir='./inference_results', save_crop_res=False, crop_res_save_dir='./output', use_mp=False, total_process_num=1, process_id=0, benchmark=False, save_log_path='./log_output/', show_log=True, use_onnx=False, output='./output',
table_max_len=488, table_algorithm='TableAttn', table_model_dir=None, merge_no_span_structure=True, table_char_dict_path=None, layout_model_dir=None, layout_dict_path=None, layout_score_threshold=0.5, layout_nms_threshold=0.5, kie_algorithm='LayoutXLM', ser_model_dir=None, re_model_dir=None, use_visual_backbone=True, ser_dict_path='../train_data/XFUND/class_list_xfun.txt', ocr_order_method=None, mode='structure', image_orientation=False, layout=True, table=True, ocr=True, recovery=False, use_pdf2docx_api=False, invert=False, binarize=False, alphacolor=(255, 255, 255), lang='ch', det=True, rec=True, type='ocr', ocr_version='PP-OCRv4', structure_version='PP-StructureV2')


C++ Traceback (most recent call last):

0 paddle_infer::Predictor::Predictor(paddle::AnalysisConfig const&)
1 std::unique_ptr<paddle::PaddlePredictor, std::default_delete > paddle::CreatePaddlePredictor<paddle::AnalysisConfig, (paddle::PaddleEngineKind)2>(paddle::AnalysisConfig const&)
2 paddle::AnalysisPredictor::Init(std::shared_ptr const&, std::shared_ptr const&)
3 paddle::AnalysisPredictor::PrepareProgram(std::shared_ptr const&)
4 paddle::AnalysisPredictor::OptimizeInferenceProgram()
5 paddle::inference::analysis::Analyzer::RunAnalysis(paddle::inference::analysis::Argument)
6 paddle::inference::analysis::IrAnalysisPass::RunImpl(paddle::inference::analysis::Argument)
7 paddle::inference::analysis::IRPassManager::Apply(std::unique_ptr<paddle::framework::ir::Graph, std::default_delete >)
8 paddle::framework::ir::Pass::Apply(paddle::framework::ir::Graph) const
9 paddle::framework::ir::SelfAttentionFusePass::ApplyImpl(paddle::framework::ir::Graph) const
10 paddle::framework::ir::GraphPatternDetector::operator()(paddle::framework::ir::Graph, std::function<void (std::map<paddle::framework::ir::PDNode, paddle::framework::ir::Node, paddle::framework::ir::GraphPatternDetector::PDNodeCompare, std::allocator<std::pair<paddle::framework::ir::PDNode const, paddle::framework::ir::Node> > > const&, paddle::framework::ir::Graph)>)


Error Message Summary:

FatalError: Illegal instruction is detected by the operating system. [TimeInfo: Aborted at 1704261377 (unix time) try "date -d @1704261377" if you are using GNU date ] [SignalInfo: SIGILL (@0x7f4e1c49386a) received by PID 1 (TID 0x7f4e23e1d740) from PID 474560618 ]`

cuicheng01 commented 11 months ago

On our side, CUDA 11.7 + Paddle 2.6 works without problems. Could you provide more specific environment information and the test command?

gowy222 commented 11 months ago

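On our side, CUDA 11.7 + Paddle 2.6 works without problems. Could you provide more specific environment information and the test command?

It's CPU-only, so CUDA is not installed... the cloud server has no graphics card to begin with...

FROM python:3.10-slim-bullseye
# Disable the GPU at the environment level
ENV CUDA_VISIBLE_DEVICES=-1

Environment information: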

CPU Architecture:
CPU Model: AMD EPYC 7K62 48-Core Processor
CPU Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate sme ssbd mba sev ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip rdpid overflow_recov succor smca
GPU Details: No NVIDIA GPU detected
GCC Version: No GCC installed
GLIBC Version: ldd (Debian GLIBC 2.31-13+deb11u7) 2.31

Packages installed with pip:

anyio 4.2.0
astor 0.8.1
attrdict 2.0.1
Babel 2.14.0
bce-python-sdk 0.8.98
beautifulsoup4 4.12.2
blinker 1.7.0
cachetools 5.3.2
certifi 2023.11.17
charset-normalizer 3.3.2
click 8.1.7
contourpy 1.2.0
cssselect 1.2.0
cssutils 2.9.0
cycler 0.12.1
Cython 3.0.7
decorator 5.1.1
et-xmlfile 1.1.0
exceptiongroup 1.2.0
fire 0.5.0
Flask 3.0.0
flask-babel 4.0.0
fonttools 4.47.0
future 0.18.3
h11 0.14.0
httpcore 1.0.2
httpx 0.26.0
idna 3.6
imageio 2.33.1
imgaug 0.4.0
itsdangerous 2.1.2
Jinja2 3.1.2
kiwisolver 1.4.5
lazy_loader 0.3
lmdb 1.4.1
lxml 5.0.0
MarkupSafe 2.1.3
matplotlib 3.8.2
networkx 3.2.1
numpy 1.26.3
opencv-contrib-python 4.6.0.66
opencv-python 4.6.0.66
openpyxl 3.1.2
opt-einsum 3.3.0
packaging 23.2
paddleocr 2.7.0.3
paddlepaddle 2.6.0
pandas 2.1.4
pdf2docx 0.5.6
pillow 10.2.0
pip 23.3.2
premailer 3.10.0
protobuf 4.25.1
psutil 5.9.7
pyclipper 1.3.0.post5
pycryptodome 3.19.1
PyMuPDF 1.20.2
pyparsing 3.1.1
python-dateutil 2.8.2
python-docx 1.1.0
pytz 2023.3.post1
PyYAML 6.0.1
rapidfuzz 3.6.1
rarfile 4.1
requests 2.31.0
scikit-image 0.22.0
scipy 1.11.4
setuptools 65.5.1
shapely 2.0.2
six 1.16.0
sniffio 1.3.0
soupsieve 2.5
termcolor 2.4.0
tifffile 2023.12.9
tqdm 4.66.1
typing_extensions 4.9.0
tzdata 2023.4
urllib3 2.1.0
visualdl 2.5.3
Werkzeug 3.0.1
wheel 0.42.0

cuicheng01 commented 11 months ago

AMD CPUs may indeed have some issues here; we need to report this internally and look into it.

nigue3025 commented 11 months ago

Got the same error with CUDA 11.8 on Ubuntu 20.04 (Intel i7-13700 CPU) after upgrading from 2.5.2 to 2.6.0.

OttomanZ commented 10 months ago

I had the same issue, pip install --no-cache-dir paddlepaddle==2.5.1 paddleocr==2.7.0.3 this fixed it.

jamesdull commented 10 months ago

Looks like an AVX512 instruction snuck into the paddlepaddle==2.6.0 build. Here's the problematic instruction according to gdb:

>0x7f48dcdde86a      vmovss (%rax),%xmm16

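VMOVSS using an xmm16 register, which can only be encoded with AVX-512 (EVEX); registers xmm16 through xmm31 are not available to SSE or AVX/AVX2 code.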
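One way to check whether a given machine would trip over such an instruction is to look for the `avx512f` flag in `/proc/cpuinfo`. The following is a minimal sketch, assuming a Linux x86 host; the helper names `parse_cpu_flags` and `supports_avx512f` are made up for illustration and are not part of Paddle.

```python
from __future__ import annotations

from pathlib import Path


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the CPU feature flags listed in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 Linux kernels label the feature list "flags".
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def supports_avx512f(cpuinfo_text: str) -> bool:
    """True if the AVX-512 Foundation flag is advertised."""
    return "avx512f" in parse_cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    # Only Linux provides this file; skip quietly elsewhere.
    if cpuinfo.exists():
        print("avx512f supported:", supports_avx512f(cpuinfo.read_text()))
```

The AMD EPYC 7K62 flag list posted earlier in this thread contains `avx` and `avx2` but no `avx512*` entries, which is consistent with this diagnosis.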

polym commented 10 months ago

Thanks @OttomanZ. But pip install --no-cache-dir paddlepaddle==2.5.2 paddleocr==2.7.0.3 solved my problem, according to https://github.com/PaddlePaddle/Paddle/issues/57493

coderLinJ5945 commented 10 months ago

Baidu is really disappointing. I hit this bug the moment I tried it; truly a hopeless case!

lmyzd commented 10 months ago

I had the same issue: Intel CPU, Ubuntu, x86.

yiranzai commented 10 months ago

Thanks @OttomanZ. But pip install --no-cache-dir paddlepaddle==2.5.2 paddleocr==2.7.0.3 solved my problem, according to #57493

LGTM. I had the same issue: Intel CPU, Manjaro, x86, CPU only.

taibaimoyu commented 10 months ago

I had the same issue, pip install --no-cache-dir paddlepaddle==2.5.1 paddleocr==2.7.0.3 this fixed it.

nice answer

GuodongQi commented 10 months ago

Same here, CPU version 2.6.

xiaofeicn commented 9 months ago

Is there a fix for this? 2.5.2 works, but inference is much slower than with 2.6.0.

fanxing-6 commented 9 months ago

I had the same issue, pip install --no-cache-dir paddlepaddle==2.5.1 paddleocr==2.7.0.3 this fixed it.

Thank you, this version solves the problem.

qq70571382 commented 9 months ago

You can use paddle 2.6.0, but paddleocr then has to use PP-OCRv3; PP-OCRv4 has the problem. 2.5.2 is too slow.

qq70571382 commented 9 months ago

ocr_object = PaddleOCR(use_angle_cls=True, lang="ch", enable_mkldnn=False, ocr_version='PP-OCRv3')  # Chinese
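Because SIGILL kills the interpreter and cannot be caught with `try`/`except`, one way to pick between PP-OCRv4 and PP-OCRv3 automatically is to probe the initialization in a child process first. The sketch below assumes a POSIX system and assumes the crash surfaces as the child being terminated by SIGILL; `crashes_with_sigill`, `PROBE_V4`, and `choose_ocr_version` are hypothetical names, not part of paddleocr.

```python
import signal
import subprocess
import sys


def crashes_with_sigill(code: str, timeout: float = 300.0) -> bool:
    """Run `code` in a fresh interpreter and report whether SIGILL killed it."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    # On POSIX, a return code of -N means the child died from signal N.
    return proc.returncode == -signal.SIGILL


# Hypothetical probe: the same kind of initialization that fails in this thread.
PROBE_V4 = (
    "from paddleocr import PaddleOCR\n"
    "PaddleOCR(use_angle_cls=True, use_gpu=False, lang='ch', ocr_version='PP-OCRv4')\n"
)


def choose_ocr_version() -> str:
    """Fall back to PP-OCRv3 only when the PP-OCRv4 probe dies with SIGILL."""
    return "PP-OCRv3" if crashes_with_sigill(PROBE_V4) else "PP-OCRv4"
```

The result of `choose_ocr_version()` could then be passed as `ocr_version=` when constructing `PaddleOCR` in the main process. The probe costs one extra model load at startup, so caching the answer may be worthwhile.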

eritpchy commented 8 months ago

Same problem with the CPU-only build: 2.6.0 fails like this, 2.5.2 works.

simonejiang7 commented 8 months ago

I had the same issue, pip install --no-cache-dir paddlepaddle==2.5.1 paddleocr==2.7.0.3 this fixed it

For me, python 3.10 is not compatible but python 3.7 is working with this setup.

cole-dda commented 8 months ago

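You can use paddle 2.6.0, but paddleocr then has to use PP-OCRv3; PP-OCRv4 has the problem. It has been months; is this still not fixed?

Ubuntu 18.04 x64, Intel CPU

The newly released 2.6.1 does not work either. Is anyone at Baidu going to resolve this issue?

With paddlepaddle==2.5.2, v4 takes 20 seconds to parse an image, while v3 needs only 2 seconds for the same image; the gap is too large.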
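A comparison like this can be reproduced with a small timing helper. This is only a sketch: `time_call` is a hypothetical helper, and the commented usage assumes paddleocr is installed and that `sample.jpg` is an image of your own.

```python
import time
from typing import Any, Callable, Tuple


def time_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Hypothetical usage; requires paddleocr and an image path of your own:
# from paddleocr import PaddleOCR
# for version in ("PP-OCRv3", "PP-OCRv4"):
#     ocr = PaddleOCR(use_angle_cls=True, use_gpu=False, lang="ch", ocr_version=version)
#     _, seconds = time_call(ocr.ocr, "sample.jpg", cls=True)
#     print(version, f"{seconds:.2f}s")
```

Timing a single call includes any one-off warm-up cost, so repeating the call and discarding the first measurement gives a fairer comparison.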

zainzhoucom commented 8 months ago

The same problem is waiting to be solved

hello2mao commented 7 months ago

+1

pip install --no-cache-dir paddlepaddle-gpu==2.5.2 paddleocr==2.7.0.3 works fine.

da2vin commented 7 months ago

When will this problem be fixed?

ZhangGaoxing commented 7 months ago

The same problem in Docker environment

caicaicai commented 7 months ago

Thanks @OttomanZ. But pip install --no-cache-dir paddlepaddle==2.5.2 paddleocr==2.7.0.3 solved my problem, according to #57493

It works now, but inference takes twice as long...

StuckInLoop commented 7 months ago

this worked "pip install --no-cache-dir paddlepaddle==2.5.2 paddleocr==2.7.0.3"
but how come this problem seems to still be there after almost half a year!? Why isn't it fixed by default!?

Hisir0909 commented 4 months ago

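This worked: "pip install --no-cache-dir paddlepaddle==2.5.2 paddleocr==2.7.0.3", but how come this problem seems to still be there after almost half a year!? Why isn't it fixed by default!?

It seems to be because the newer versions use AVX-512 acceleration; if your CPU supports AVX-512, it should be fine.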

OttomanZ commented 4 months ago

@StuckInLoop I came up with this solution a year ago, and still people are facing this, but on newer Ubuntu 22.04 LTS releases I have installed, I didn't need to use this workaround. So it does look like they fixed it kinda, but for others this is still an issue.

Hisir0909 commented 4 months ago

@OttomanZ Are you currently using a CPU that supports AVX-512? With version 2.6, I tested two CPUs: an AMD 4500U without AVX-512 support and a 7840HS with AVX-512 support. The 7840HS runs fine, while the 4500U fails. However, this problem disappears in 3.0 when using CPU inference, but it still exists in 3.0 if GPU acceleration is used.
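The observations reported in this thread can be summarized as a small decision function. This is a sketch that only encodes what commenters reported above; it is not an official compatibility matrix, and `likely_to_crash` is a hypothetical name.

```python
def likely_to_crash(paddle_version: str, has_avx512: bool, use_gpu: bool) -> bool:
    """Encode the thread's reports; an assumption-laden summary, not a guarantee."""
    if has_avx512:
        # The 7840HS (with AVX-512) was reported to run 2.6 without problems.
        return False
    major_minor = tuple(int(part) for part in paddle_version.split(".")[:2])
    if major_minor == (2, 6):
        # 2.6.0 and 2.6.1 were reported to fail on CPUs without AVX-512.
        return True
    if major_minor >= (3, 0):
        # 3.0: CPU inference reported fine, GPU acceleration reported still failing.
        return use_gpu
    # 2.5.x was reported to work.
    return False
```

For example, `likely_to_crash("2.6.0", has_avx512=False, use_gpu=False)` returns `True`, matching the original report on the AMD EPYC 7K62.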