opendatalab / MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
https://opendatalab.com/OpenSourceTools
GNU Affero General Public License v3.0
11.52k stars 864 forks

ImportError: libnccl.so.2: cannot open shared object file: No such file or directory #419

Open hzzheng0612 opened 1 month ago

hzzheng0612 commented 1 month ago

Description of the bug

The first run, as suggested in step 8 of README_Ubuntu_CUDA_Acceleration_en_US.md, fails.

How to reproduce the bug

ImportError: libnccl.so.2: cannot open shared object file: No such file or directory
2024-08-13 17:07:33.256 | ERROR    | magic_pdf.model.pdf_extract_kit:<module>:28 - Required dependency not installed, please install by "pip install magic-pdf[full] --extra-index-url https://myhloli.github.io/wheels/"

Operating system

Linux

Python version

3.10

Software version (magic-pdf --version)

0.7.0b1

Device mode

cuda

lianyant commented 1 month ago

(MinerU) llw@lianyan:~/github/marker/pdf_marker/workspace2/pdf$ magic-pdf -p small_ocr.pdf
2024-08-14 08:33:14.220 | INFO | magic_pdf.libs.pdf_check:detect_invalid_chars:57 - cid_count: 0, text_len: 8, cid_chars_radio: 0.0
2024-08-14 08:33:14.221 | WARNING | magic_pdf.filter.pdf_classify_by_type:classify:334 - pdf is not classified by area and text_len, by_image_area: False, by_text: False, by_avg_words: False, by_img_num: True, by_text_layout: False, by_img_narrow_strips: False, by_invalid_chars: True
2024-08-14 08:33:14.240 | ERROR | magic_pdf.model.pdf_extract_kit:<module>:27 - libnccl.so.2: cannot open shared object file: No such file or directory
Traceback (most recent call last):

  File "/home/llw/miniconda3/envs/MinerU/bin/magic-pdf", line 8, in <module>
    sys.exit(cli())
    │   │    └
    │   └
    └ <module 'sys' (built-in)>
  File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           │    │     │       └ {}
           │    │     └ ()
           │    └ <function BaseCommand.main at 0x7d6fd1a1a200>
           └
  File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         │    │      └ <click.core.Context object at 0x7d6fd1d56d40>
         │    └ <function Command.invoke at 0x7d6fd1a1acb0>
         └
  File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           │   │      │    │           │   └ {'path': 'small_ocr.pdf', 'output_dir': '', 'method': 'auto'}
           │   │      │    │           └ <click.core.Context object at 0x7d6fd1d56d40>
           │   │      │    └ <function cli at 0x7d6f5f338700>
           │   │      └
           │   └ <function Context.invoke at 0x7d6fd1a19a20>
           └ <click.core.Context object at 0x7d6fd1d56d40>
  File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
                       │       └ {'path': 'small_ocr.pdf', 'output_dir': '', 'method': 'auto'}
                       └ ()
  File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/tools/cli.py", line 75, in cli
    parse_doc(path)
    │         └ 'small_ocr.pdf'
    └ <function cli.<locals>.parse_doc at 0x7d6fd1c33490>
  File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/tools/cli.py", line 60, in parse_doc
    do_parse(
    └ <function do_parse at 0x7d6f5f323be0>
  File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/tools/common.py", line 65, in do_parse
    pipe.pipe_analyze()
    │    └ <function UNIPipe.pipe_analyze at 0x7d6f5f323880>
    └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x7d6f5f328d00>
  File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/pipe/UNIPipe.py", line 31, in pipe_analyze
    self.model_list = doc_analyze(self.pdf_bytes, ocr=True)
    │    │            │           │    └ b'%PDF-1.7\r\n%\xa1\xb3\xc5\xd7\r\n1 0 obj\r\n<</Pages 2 0 R /Type/Catalog>>\r\nendobj\r\n2 0 obj\r\n<</Count 8/Kids[ 4 0 R ...
    │    │            │           └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x7d6f5f328d00>
    │    │            └ <function doc_analyze at 0x7d6fcc9d68c0>
    │    └ []
    └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x7d6f5f328d00>
  File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/model/doc_analyze_by_custom_model.py", line 109, in doc_analyze
    custom_model = model_manager.get_model(ocr, show_log)
                   │             │         │    └ False
                   │             │         └ True
                   │             └ <function ModelSingleton.get_model at 0x7d6fcc9d6830>
                   └ <magic_pdf.model.doc_analyze_by_custom_model.ModelSingleton object at 0x7d6f5ee28460>
  File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/model/doc_analyze_by_custom_model.py", line 63, in get_model
    self._models[key] = custom_model_init(ocr=ocr, show_log=show_log)
    │    │       │      │                     │             └ False
    │    │       │      │                     └ True
    │    │       │      └ <function custom_model_init at 0x7d6fcc9d6710>
    │    │       └ (True, False)
    │    └ {}
    └ <magic_pdf.model.doc_analyze_by_custom_model.ModelSingleton object at 0x7d6f5ee28460>
  File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/model/doc_analyze_by_custom_model.py", line 83, in custom_model_init
    from magic_pdf.model.pdf_extract_kit import CustomPEKModel
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/magic_pdf/model/pdf_extract_kit.py", line 13, in <module>
    import torch
  File "/home/llw/miniconda3/envs/MinerU/lib/python3.10/site-packages/torch/__init__.py", line 239, in <module>
    from torch._C import *  # noqa: F403

ImportError: libnccl.so.2: cannot open shared object file: No such file or directory
2024-08-14 08:33:14.246 | ERROR | magic_pdf.model.pdf_extract_kit:<module>:28 - Required dependency not installed, please install by "pip install magic-pdf[full] --extra-index-url https://myhloli.github.io/wheels/"

I get the same error with the same steps and the same environment.

myhloli commented 1 month ago

@hzzheng0612 @lianyant Have you installed NCCL? https://developer.nvidia.com/nccl
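One quick way to answer that question before reinstalling anything is to ask the dynamic loader directly. The sketch below is only an illustration, not part of MinerU: the helper name can_load_shared_library is made up here, and it assumes that a failed ctypes.CDLL call means the loader cannot resolve the library, which is the same condition that makes "import torch" fail in the traceback above.

```python
import ctypes


def can_load_shared_library(name: str) -> bool:
    """Return True if the dynamic loader can open the named shared library.

    Hypothetical helper for illustration; it is not a MinerU or PyTorch API.
    """
    try:
        ctypes.CDLL(name)  # raises OSError when the library cannot be opened
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # libnccl.so.2 is the library named in the ImportError from this issue.
    library = "libnccl.so.2"
    if can_load_shared_library(library):
        print(f"{library} can be loaded")
    else:
        print(f"{library} cannot be loaded; NCCL is probably missing or not on the loader path")
```

If the script reports that the library cannot be loaded, installing NCCL from the link above (or adding its directory to the loader search path) is the likely next step.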

luxinfeng commented 1 month ago

I get the same error on CentOS 7 with Python 3.10 in CPU mode, on version 0.7.0b1.

luxinfeng commented 1 month ago

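> I get the same error on CentOS 7 with Python 3.10 in CPU mode, on version 0.7.0b1.

After investigating, I found that in my case the error was caused by several missing OpenGL libraries. Once I added these dependencies with the following command, everything ran normally:
yum -y install epel-release && yum -y install mesa-libGL mesa-libGLU libXtst libXrender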
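To see which of those libraries are absent before running yum, a rough check along the following lines may help. It is a sketch under assumptions: the helper find_missing and the mapping from short library names to the yum packages mentioned above are guesses made for illustration, and ctypes.util.find_library only reports what the system's usual lookup tools can see.

```python
from ctypes.util import find_library

# Assumed mapping from linker-style library names to the yum packages named
# in the comment above; adjust it if your distribution packages them differently.
CANDIDATE_LIBRARIES = {
    "GL": "mesa-libGL",
    "GLU": "mesa-libGLU",
    "Xtst": "libXtst",
    "Xrender": "libXrender",
}


def find_missing(names):
    """Return the library names that find_library cannot locate.

    Hypothetical helper for illustration only.
    """
    return [name for name in names if find_library(name) is None]


if __name__ == "__main__":
    missing = find_missing(CANDIDATE_LIBRARIES)
    if not missing:
        print("All candidate libraries were found")
    for name in missing:
        print(f"lib{name} not found; the package {CANDIDATE_LIBRARIES[name]} may be missing")
```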

lianyant commented 1 month ago

> @hzzheng0612 @lianyant Have you installed NCCL? https://developer.nvidia.com/nccl

After I installed this, everything worked normally. Thank you very much, @myhloli.