opendatalab / MinerU

A high-quality, one-stop open-source data extraction tool for converting PDF to Markdown and JSON.
https://opendatalab.com/OpenSourceTools?tool=extract
GNU Affero General Public License v3.0

Unable to use GPU for acceleration #1090

Closed hhr114 closed 4 days ago

hhr114 commented 4 days ago

Description of the bug

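My system is CentOS 7, so I installed the magic-pdf[full,old_linux] version. Installation went fine and the demo runs normally on CPU, but with CUDA acceleration I get the error below. Adding libcudnn_cnn_infer.so.8 and libcudnn_ops_infer.so.8 to LD_LIBRARY_PATH did not help either.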

relocation error: /share/home.newer/hrhe/env/MinerU/lib/python3.10/site-packages/torch/lib/../../nvidia/cudnn/lib/libcudnn_cnn_infer.so.8: symbol _Z20traceback_iretf_implPKcRKN5cudnn16InternalStatus_tEb, version libcudnn_ops_infer.so.8 not defined in file libcudnn_ops_infer.so.8 with link time reference

How to reproduce the bug

One more detail: I created the Python 3.10 environment with virtualenv, and the installation finished without errors.

Operating system

Linux

Python version

3.10

Software version (magic-pdf --version)

0.10.x

Device mode

cuda

myhloli commented 4 days ago

Could you try installing with conda?

hhr114 commented 4 days ago

I just went through the whole process again with conda, and I still get the same problem...

myhloli commented 4 days ago

Is the paddlegpu you installed version 3.0.0b1? If that version does not work either, uninstall both paddle and paddlegpu and use the CPU version of paddle.

hhr114 commented 4 days ago

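The error already appears while the GPU is accelerating the layout detection and MFR stages (the ones reported as "layout detection cost" and "mfr time"), before it even reaches the step where Paddle accelerates OCR.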

myhloli commented 4 days ago

Please paste the output of nvidia-smi.

hhr114 commented 4 days ago

I tried two machines; the output from both is below. [image] [image]

myhloli commented 4 days ago

Did you enable the table feature?

hhr114 commented 4 days ago

No, everything is the default configuration; I only changed the device mode to cuda.

myhloli commented 4 days ago

Please post the output of pip list.

hhr114 commented 4 days ago

Package Version
---------------
absl-py 2.1.0
accelerate 1.1.1
aiohappyeyeballs 2.4.3
aiohttp 3.11.7
aiosignal 1.3.1
albucore 0.0.19
albumentations 1.4.20
annotated-types 0.7.0
antlr4-python3-runtime 4.9.3
anyio 4.6.2.post1
astor 0.8.1
async-timeout 5.0.1
attrdict 2.0.1
attrs 24.2.0
babel 2.16.0
bce-python-sdk 0.9.23
beautifulsoup4 4.12.3
black 24.10.0
blinker 1.9.0
boto3 1.35.69
botocore 1.35.69
braceexpand 0.1.7
Brotli 1.1.0
cachetools 5.5.0
certifi 2024.8.30
cffi 1.17.1
charset-normalizer 3.4.0
click 8.1.7
cloudpickle 3.1.0
coloredlogs 15.0.1
colorlog 6.9.0
contourpy 1.3.1
cryptography 43.0.3
cssselect 1.2.0
cssutils 2.11.1
cycler 0.12.1
Cython 3.0.11
datasets 3.1.0
decorator 5.1.1
detectron2 0.6
dill 0.3.8
doclayout_yolo 0.0.2
einops 0.8.0
et_xmlfile 2.0.0
eva-decord 0.6.1
eval_type_backport 0.2.0
evaluate 0.4.3
exceptiongroup 1.2.2
fairscale 0.4.13
fast-langdetect 0.2.0
fasttext-wheel 0.9.2
filelock 3.16.1
fire 0.7.0
Flask 3.1.0
flask-babel 4.0.0
flatbuffers 24.3.25
fonttools 4.55.0
frozenlist 1.5.0
fsspec 2024.9.0
ftfy 6.3.1
future 1.0.0
fvcore 0.1.5.post20221221
grpcio 1.68.0
h11 0.14.0
httpcore 1.0.7
httpx 0.27.2
huggingface-hub 0.26.2
humanfriendly 10.0
hydra-core 1.3.2
idna 3.10
imageio 2.36.0
imgaug 0.4.0
iopath 0.1.9
itsdangerous 2.2.0
Jinja2 3.1.4
jmespath 1.0.1
joblib 1.4.2
kiwisolver 1.4.7
langdetect 1.0.9
lazy_loader 0.4
lmdb 1.5.1
loguru 0.7.2
lxml 5.3.0
magic-pdf 0.10.1
Markdown 3.7
MarkupSafe 3.0.2
matplotlib 3.9.2
more-itertools 10.5.0
mpmath 1.3.0
multidict 6.1.0
multiprocess 0.70.16
mypy-extensions 1.0.0
networkx 3.4.2
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.6.85
nvidia-nvtx-cu12 12.1.105
omegaconf 2.3.0
onnxruntime 1.16.3
opencv-contrib-python 4.6.0.66
opencv-python 4.6.0.66
opencv-python-headless 4.10.0.84
openpyxl 3.1.5
opt-einsum 3.3.0
packaging 24.2
paddleocr 2.7.3
paddlepaddle 3.0.0b1
pandas 2.2.3
pathspec 0.12.1
pdf2docx 0.5.8
pdfminer.six 20231228
pillow 11.0.0
pip 24.2
platformdirs 4.3.6
portalocker 3.0.0
premailer 3.10.0
propcache 0.2.0
protobuf 5.28.3
psutil 6.1.0
py-cpuinfo 9.0.0
pyarrow 18.1.0
pybind11 2.13.6
pyclipper 1.3.0.post6
pycocotools 2.0.8
pycparser 2.22
pycryptodome 3.21.0
pydantic 2.7.4
pydantic_core 2.18.4
PyMuPDF 1.24.14
pyparsing 3.2.0
python-dateutil 2.9.0.post0
python-docx 1.1.2
pytz 2024.2
PyYAML 6.0.2
rapid-table 0.3.0
RapidFuzz 3.10.1
rapidocr-paddle 1.4.0
rarfile 4.2
regex 2024.11.6
requests 2.32.3
robust-downloader 0.0.2
s3transfer 0.10.4
safetensors 0.4.5
scikit-image 0.24.0
scikit-learn 1.5.2
scipy 1.14.1
seaborn 0.13.2
setuptools 75.1.0
shapely 2.0.6
six 1.16.0
sniffio 1.3.1
soupsieve 2.6
stringzilla 3.10.10
struct-eqtable 0.3.2
sympy 1.13.3
tabulate 0.9.0
tensorboard 2.18.0
tensorboard-data-server 0.7.2
termcolor 2.5.0
thop 0.1.1.post2209072238
threadpoolctl 3.5.0
tifffile 2024.9.20
timm 0.9.16
tokenizers 0.19.1
tomli 2.1.0
torch 2.3.1
torchtext 0.18.0
torchvision 0.18.1
tqdm 4.67.1
transformers 4.42.4
triton 2.3.1
typing_extensions 4.12.2
tzdata 2024.2
ultralytics 8.3.37
ultralytics-thop 2.0.12
unimernet 0.2.1
urllib3 2.2.3
visualdl 2.5.3
Wand 0.6.13
wcwidth 0.2.13
webdataset 0.2.100
Werkzeug 3.1.3
wheel 0.44.0
xxhash 3.5.0
yacs 0.1.8
yarl 1.18.0

myhloli commented 4 days ago

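It looks like torch cannot use the cu12 environment that pip installed. Could your environment variables be pointing at a different CUDA version installed system-wide? I suggest cleaning up the system environment variables.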
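For anyone hitting the same relocation error, one way to check this suspicion is to look for directories on the library search path that ship their own cuDNN libraries, since those can shadow the copy under site-packages/nvidia/cudnn/lib. The following is only a rough sketch: the list of variables in PATH_VARS and the helper name find_cudnn_dirs are assumptions made for illustration and are not part of MinerU or torch.

```python
import os
from pathlib import Path

# Variables that commonly influence which shared libraries get picked up.
# This list is an assumption for illustration, not something from the thread.
PATH_VARS = ("LD_LIBRARY_PATH", "LIBRARY_PATH", "CUDA_HOME", "CUDA_PATH")


def find_cudnn_dirs(env=None, pattern="libcudnn*.so*"):
    """Return {variable: [directories that contain files matching pattern]}."""
    env = os.environ if env is None else env
    hits = {}
    for var in PATH_VARS:
        for entry in env.get(var, "").split(os.pathsep):
            if not entry:
                continue
            base = Path(entry)
            # CUDA_HOME-style variables point at an install root,
            # so also look in its lib64/ and lib/ subdirectories.
            for candidate in (base, base / "lib64", base / "lib"):
                if candidate.is_dir() and any(candidate.glob(pattern)):
                    hits.setdefault(var, []).append(str(candidate))
    return hits


if __name__ == "__main__":
    for var, dirs in find_cudnn_dirs().items():
        for d in dirs:
            print(f"{var}: cuDNN libraries found in {d}")
```

Any directory reported here that is not inside the Python environment itself is a candidate for removal from the corresponding variable.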

hhr114 commented 4 days ago

OK, I'll try cleaning up LIBRARY_PATH.
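A cleanup like this can also be tried without touching the shell profile, by building a filtered copy of the environment and launching the tool with it. This is a minimal sketch under stated assumptions: the helper names strip_entries and cleaned_env, the choice of variables, and the "cuda"/"cudnn" substrings are all hypothetical and should be adapted to whatever the machine actually exports.

```python
import os


def strip_entries(value, unwanted_substrings):
    """Drop entries of a PATH-style string that contain any unwanted substring."""
    kept = [
        entry
        for entry in value.split(os.pathsep)
        if entry and not any(s in entry.lower() for s in unwanted_substrings)
    ]
    return os.pathsep.join(kept)


def cleaned_env(env=None,
                variables=("LD_LIBRARY_PATH", "LIBRARY_PATH"),
                unwanted_substrings=("cuda", "cudnn")):
    """Return a copy of env with CUDA-looking entries removed from variables.

    The variable names and substrings are illustrative defaults (assumptions).
    A variable that ends up empty is removed from the copy entirely.
    """
    env = dict(os.environ if env is None else env)
    for var in variables:
        if var in env:
            new_value = strip_entries(env[var], unwanted_substrings)
            if new_value:
                env[var] = new_value
            else:
                del env[var]
    return env
```

The copy can then be passed to a child process, for example subprocess.run(["magic-pdf", "--version"], env=cleaned_env()), which leaves the current shell untouched. Note that the substring filter also drops manually added site-packages/nvidia/cudnn entries; that is intentional in this sketch, on the assumption that the pip-installed torch locates its bundled libraries on its own.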

hhr114 commented 4 days ago

It works now. Thank you, great help!