opendatalab / MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
https://mineru.readthedocs.io/
GNU Affero General Public License v3.0
13.75k stars 1.03k forks source link

DLL load failed while importing _c_internal_utils: 找不到指定的模块。 (The specified module could not be found.) #317

Closed shmiluyu closed 3 months ago

shmiluyu commented 3 months ago

Description of the bug

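I followed the installation steps exactly and set up the whole environment. Running pip install magic-pdf[full]==0.6.2b1 again reports that all dependencies are already satisfied. But running the demo conversion command fails with:

2024-08-04 11:49:53.272 | ERROR | magic_pdf.model.pdf_extract_kit:<module>:24 - DLL load failed while importing _c_internal_utils: 找不到指定的模块。 (The specified module could not be found.)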

pip list

Package Version
absl-py 2.1.0
aiohappyeyeballs 2.3.4
aiohttp 3.10.0
aiosignal 1.3.1
albucore 0.0.13
albumentations 1.4.12
annotated-types 0.7.0
antlr4-python3-runtime 4.9.3
anyio 4.4.0
astor 0.8.1
async-timeout 4.0.3
attrdict 2.0.1
attrs 24.1.0
Babel 2.15.0
bce-python-sdk 0.9.19
beautifulsoup4 4.12.3
black 24.8.0
blinker 1.8.2
boto3 1.34.153
botocore 1.34.153
braceexpand 0.1.7
Brotli 1.1.0
cachetools 5.4.0
certifi 2024.7.4
cffi 1.16.0
charset-normalizer 3.3.2
click 8.1.7
cloudpickle 3.0.0
colorama 0.4.6
colorlog 6.8.2
contourpy 1.2.1
cryptography 43.0.0
cssselect 1.2.0
cssutils 2.11.1
cycler 0.12.1
Cython 3.0.10
datasets 2.20.0
decorator 5.1.1
detectron2 0.6
dill 0.3.8
et-xmlfile 1.1.0
eva-decord 0.6.1
eval_type_backport 0.2.0
evaluate 0.4.2
exceptiongroup 1.2.2
fairscale 0.4.13
fast-langdetect 0.2.0
fasttext-wheel 0.9.2
filelock 3.15.4
fire 0.6.0
Flask 3.0.3
flask-babel 4.0.0
fonttools 4.53.1
frozenlist 1.4.1
fsspec 2024.5.0
ftfy 6.2.0
future 1.0.0
fvcore 0.1.5.post20221221
grpcio 1.65.4
h11 0.14.0
httpcore 1.0.5
httpx 0.27.0
huggingface-hub 0.24.5
hydra-core 1.3.2
idna 3.7
imageio 2.34.2
imgaug 0.4.0
intel-openmp 2021.4.0
iopath 0.1.9
itsdangerous 2.2.0
Jinja2 3.1.4
jmespath 1.0.1
joblib 1.4.2
kiwisolver 1.4.5
langdetect 1.0.9
lazy_loader 0.4
lmdb 1.5.1
loguru 0.7.2
lxml 5.2.2
magic-pdf 0.6.2b1
Markdown 3.6
MarkupSafe 2.1.5
matplotlib 3.9.1
mkl 2021.4.0
more-itertools 10.3.0
mpmath 1.3.0
multidict 6.0.5
multiprocess 0.70.16
mypy-extensions 1.0.0
networkx 3.3
numpy 1.26.4
omegaconf 2.3.0
opencv-contrib-python 4.6.0.66
opencv-python 4.6.0.66
opencv-python-headless 4.10.0.84
openpyxl 3.1.5
opt-einsum 3.3.0
packaging 24.1
paddleocr 2.7.3
paddlepaddle 2.6.1
pandas 2.2.2
pathspec 0.12.1
pdf2docx 0.5.8
pdfminer.six 20231228
pillow 10.4.0
pip 24.0
platformdirs 4.2.2
portalocker 2.10.1
premailer 3.10.0
protobuf 3.20.2
psutil 6.0.0
py-cpuinfo 9.0.0
pyarrow 17.0.0
pyarrow-hotfix 0.6
pybind11 2.13.1
pyclipper 1.3.0.post5
pycocotools 2.0.8
pycparser 2.22
pycryptodome 3.20.0
pydantic 2.8.2
pydantic_core 2.20.1
PyMuPDF 1.24.9
PyMuPDFb 1.24.9
pyparsing 3.1.2
python-dateutil 2.9.0.post0
python-docx 1.1.2
pytz 2024.1
pywin32 306
PyYAML 6.0.1
rapidfuzz 3.9.5
rarfile 4.2
regex 2024.7.24
requests 2.32.3
robust-downloader 0.0.2
s3transfer 0.10.2
safetensors 0.4.3
scikit-image 0.24.0
scikit-learn 1.5.1
scipy 1.14.0
seaborn 0.13.2
setuptools 69.5.1
shapely 2.0.5
six 1.16.0
sniffio 1.3.1
soupsieve 2.5
sympy 1.13.1
tabulate 0.9.0
tbb 2021.13.0
tensorboard 2.17.0
tensorboard-data-server 0.7.2
termcolor 2.4.0
threadpoolctl 3.5.0
tifffile 2024.7.24
timm 0.9.16
tokenizers 0.19.1
tomli 2.0.1
torch 2.3.1
torchtext 0.18.0
torchvision 0.18.1
tqdm 4.66.5
transformers 4.40.0
typing_extensions 4.12.2
tzdata 2024.1
ultralytics 8.2.72
ultralytics-thop 2.0.0
unimernet 0.1.6
urllib3 2.2.2
visualdl 2.5.3
Wand 0.6.13
wcwidth 0.2.13
webdataset 0.2.86
Werkzeug 3.0.3
wheel 0.43.0
win32-setctime 1.1.0
wordninja 2.0.0
xxhash 3.4.1
yacs 0.1.8
yarl 1.9.4

How to reproduce the bug

magic-pdf pdf-command --pdf "D:/magic-pdf/6105137170.pdf" --inside_model true

Operating system

Windows

Python version

3.10

Software version (magic-pdf --version)

0.6.x

Device mode

cpu

myhloli commented 3 months ago

Can you provide more stack trace information from the error?

shmiluyu commented 3 months ago

> Can you provide more stack trace information from the error?

` 2024-08-04 14:10:08.146 | WARNING | magic_pdf.cli.magicpdf:get_model_json:312 - not found json D:/work/github/magic-pdf/6105137170.json existed

2024-08-04 14:10:08.146 | WARNING | magic_pdf.libs.config_reader:get_local_dir:64 - 'temp-output-dir' not found in magic-pdf.json, use '/tmp' as default

2024-08-04 14:10:09.351 | INFO | magic_pdf.libs.pdf_check:detect_invalid_chars:57 - cid_count: 0, text_len: 7728, cid_chars_radio: 0.0

2024-08-04 14:10:12.121 | ERROR | magic_pdf.model.pdf_extract_kit:<module>:24 - DLL load failed while importing _c_internal_utils: 找不到指定的模块。

Traceback (most recent call last):

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, │ │ └ {'name': 'main', 'doc': None, 'package': '', 'loader': <zipimporter object "D:\dev-stuff\scoop\apps\anaco... │ └ <code object at 0x00000259B68E05B0, file "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\Scripts\magic-pd... └ <function _run_code at 0x00000259B68BE050>

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\runpy.py", line 86, in _run_code exec(code, run_globals) │ └ {'name': 'main', 'doc': None, 'package': '', 'loader': <zipimporter object "D:\dev-stuff\scoop\apps\anaco... └ <code object at 0x00000259B68E05B0, file "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\Scripts\magic-pd...

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\Scripts\magic-pdf.exe__main__.py", line 7, in sys.exit(cli()) │ │ └ │ └ └ <module 'sys' (built-in)>

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\click\core.py", line 1157, in call return self.main(*args, **kwargs) │ │ │ └ {} │ │ └ () │ └ <function BaseCommand.main at 0x00000259B6D2B640> └

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\click\core.py", line 1078, in main rv = self.invoke(ctx) │ │ └ <click.core.Context object at 0x00000259B6928FA0> │ └ <function MultiCommand.invoke at 0x00000259B6D3C670> └

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\click\core.py", line 1688, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) │ │ │ │ └ <click.core.Context object at 0x000002598C6CF2E0> │ │ │ └ <function Command.invoke at 0x00000259B6D3C160> │ │ └ │ └ <click.core.Context object at 0x000002598C6CF2E0> └ <function MultiCommand.invoke.._process_result at 0x00000259B668C5E0>

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\click\core.py", line 1434, in invoke return ctx.invoke(self.callback, **ctx.params) │ │ │ │ │ └ {'pdf': 'D:/work/github/magic-pdf/6105137170.pdf', 'inside_model': True, 'model': None, 'method': 'auto', 'model_mode': 'full'} │ │ │ │ └ <click.core.Context object at 0x000002598C6CF2E0> │ │ │ └ <function pdf_command at 0x000002598C6F23B0> │ │ └ │ └ <function Context.invoke at 0x00000259B6D2AE60> └ <click.core.Context object at 0x000002598C6CF2E0>

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\click\core.py", line 783, in invoke return __callback(*args, **kwargs) │ └ {'pdf': 'D:/work/github/magic-pdf/6105137170.pdf', 'inside_model': True, 'model': None, 'method': 'auto', 'model_mode': 'full'} └ ()

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\magic_pdf\cli\magicpdf.py", line 352, in pdf_command parse_doc(pdf) │ └ 'D:/work/github/magic-pdf/6105137170.pdf' └ <function pdf_command..parse_doc at 0x000002598C6F20E0>

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\magic_pdf\cli\magicpdf.py", line 330, in parse_doc do_parse( └ <function do_parse at 0x000002598C6F1CF0>

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\magic_pdf\cli\magicpdf.py", line 111, in do_parse pipe.pipe_analyze() │ └ <function UNIPipe.pipe_analyze at 0x000002598C6F0DC0> └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x000002598C6CEE60>

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\magic_pdf\pipe\UNIPipe.py", line 29, in pipe_analyze self.model_list = doc_analyze(self.pdf_bytes, ocr=False) │ │ │ │ └ b'%PDF-1.3\n%\xe2\xe3\xcf\xd3\n11 0 obj\r<<\r/Length 12 0 R\r/Filter [ /FlateDecode ]\r>>\rstream\nx\x9c\xbdZ\xcd\xabeG\x11... │ │ │ └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x000002598C6CEE60> │ │ └ <function doc_analyze at 0x00000259B9480D30> │ └ [] └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x000002598C6CEE60>

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\magic_pdf\model\doc_analyze_by_custom_model.py", line 103, in doc_analyze custom_model = model_manager.get_model(ocr, show_log) │ │ │ └ False │ │ └ False │ └ <function ModelSingleton.get_model at 0x00000259B9480CA0> └ <magic_pdf.model.doc_analyze_by_custom_model.ModelSingleton object at 0x000002598C6CF7F0>

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\magic_pdf\model\doc_analyze_by_custom_model.py", line 63, in get_model self._models[key] = custom_model_init(ocr=ocr, show_log=show_log) │ │ │ │ │ └ False │ │ │ │ └ False │ │ │ └ <function custom_model_init at 0x00000259B9480B80> │ │ └ (False, False) │ └ {} └ <magic_pdf.model.doc_analyze_by_custom_model.ModelSingleton object at 0x000002598C6CF7F0>

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\magic_pdf\model\doc_analyze_by_custom_model.py", line 83, in custom_model_init from magic_pdf.model.pdf_extract_kit import CustomPEKModel

File "", line 1027, in _find_and_load File "", line 1006, in _find_and_load_unlocked File "", line 688, in _load_unlocked File "", line 883, in exec_module File "", line 241, in _call_with_frames_removed

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\magic_pdf\model\pdf_extract_kit.py", line 18, in <module> from ultralytics import YOLO

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\ultralytics\__init__.py", line 10, in <module> from ultralytics.data.explorer.explorer import Explorer

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\ultralytics\data\__init__.py", line 3, in <module> from .base import BaseDataset

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\ultralytics\data\base.py", line 17, in <module> from ultralytics.data.utils import FORMATS_HELP_MSG, HELP_URL, IMG_FORMATS

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\ultralytics\data\utils.py", line 19, in <module> from ultralytics.nn.autobackend import check_class_names

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\ultralytics\nn\__init__.py", line 3, in <module> from .tasks import (

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\ultralytics\nn\tasks.py", line 10, in <module> from ultralytics.nn.modules import (

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\ultralytics\nn\modules\__init__.py", line 20, in <module> from .block import (

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\ultralytics\nn\modules\block.py", line 8, in <module> from ultralytics.utils.torch_utils import fuse_conv_and_bn

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\ultralytics\utils\__init__.py", line 21, in <module> import matplotlib.pyplot as plt

File "D:\dev-stuff\scoop\apps\anaconda3\current\App\envs\MinerU\lib\site-packages\matplotlib\__init__.py", line 159, in <module> from . import _api, _version, cbook, _docstring, rcsetup

from matplotlib import _api, _c_internal_utils

:25 - Required dependency not installed, please install by "pip install magic-pdf[full] detectron2 --extra-index-url https://myhloli.github.io/wheels/" `

myhloli commented 3 months ago

It looks like matplotlib wasn't installed properly. Try uninstalling and reinstalling it.
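To check whether the failure comes from matplotlib itself rather than from magic-pdf, you can import the modules named in the traceback directly. This is only a minimal sketch: `probe_import` is a hypothetical helper written for illustration, not part of magic-pdf or matplotlib.

```python
import importlib


def probe_import(module_name):
    """Try to import a module; return (ok, detail) instead of raising."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # "DLL load failed" surfaces as ImportError
        return False, f"{type(exc).__name__}: {exc}"
    return True, ""


if __name__ == "__main__":
    # Module names taken from the traceback in this thread.
    for name in ("matplotlib._c_internal_utils", "matplotlib.pyplot"):
        ok, detail = probe_import(name)
        print(f"{name}: {'ok' if ok else 'FAILED ' + detail}")
```

If both imports fail in a plain Python session inside the same environment, the problem is in the matplotlib installation and not in magic-pdf.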

FigureLean commented 3 months ago

Same problem here, and uninstalling and reinstalling didn't help either.

magic-pdf pdf-command --pdf "E:\PDF-Extract-Kit\PDF-Extract-Kit\demo\模拟试卷.pdf" --inside_model true

2024-08-04 17:08:38.161 | WARNING | magic_pdf.cli.magicpdf:get_model_json:312 - not found json E:\PDF-Extract-Kit\PDF-Extract-Kit\demo\模拟试卷.json existed

2024-08-04 17:08:38.161 | WARNING | magic_pdf.libs.config_reader:get_local_dir:64 - 'temp-output-dir' not found in magic-pdf.json, use '/tmp' as default

2024-08-04 17:08:38.514 | INFO | magic_pdf.libs.pdf_check:detect_invalid_chars:57 - cid_count: 0, text_len: 2119, cid_chars_radio: 0.0

2024-08-04 17:08:40.725 | ERROR | magic_pdf.model.pdf_extract_kit:<module>:24 - DLL load failed while importing _c_internal_utils: 找不到指定的模块。

Traceback (most recent call last):

File "E:\tensflow\MinerU\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, │ │ └ {'name': 'main', 'doc': None, 'package': '', 'loader': <zipimporter object "E:\tensflow\MinerU\Scripts\ma... │ └ <code object at 0x00000257BDB07EC0, file "E:\tensflow\MinerU\Scripts\magic-pdf.exe__main__.py", line 1> └ <function _run_code at 0x00000257BDAF0D30>

File "E:\tensflow\MinerU\lib\runpy.py", line 86, in _run_code exec(code, run_globals) │ └ {'name': 'main', 'doc': None, 'package': '', 'loader': <zipimporter object "E:\tensflow\MinerU\Scripts\ma... └ <code object at 0x00000257BDB07EC0, file "E:\tensflow\MinerU\Scripts\magic-pdf.exe__main__.py", line 1>

File "E:\tensflow\MinerU\Scripts\magic-pdf.exe__main__.py", line 7, in sys.exit(cli()) │ │ └ │ └ └ <module 'sys' (built-in)>

File "E:\tensflow\MinerU\lib\site-packages\click\core.py", line 1157, in call return self.main(*args, **kwargs) │ │ │ └ {} │ │ └ () │ └ <function BaseCommand.main at 0x00000257BDF5D750> └

File "E:\tensflow\MinerU\lib\site-packages\click\core.py", line 1078, in main rv = self.invoke(ctx) │ │ └ <click.core.Context object at 0x00000257BDB59000> │ └ <function MultiCommand.invoke at 0x00000257BDF5E710> └

File "E:\tensflow\MinerU\lib\site-packages\click\core.py", line 1688, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) │ │ │ │ └ <click.core.Context object at 0x00000257FF8733D0> │ │ │ └ <function Command.invoke at 0x00000257BDF5E200> │ │ └ │ └ <click.core.Context object at 0x00000257FF8733D0> └ <function MultiCommand.invoke.._process_result at 0x00000257BD87F880>

File "E:\tensflow\MinerU\lib\site-packages\click\core.py", line 1434, in invoke return ctx.invoke(self.callback, **ctx.params) │ │ │ │ │ └ {'pdf': 'E:\PDF-Extract-Kit\PDF-Extract-Kit\demo\模拟试卷.pdf', 'inside_model': True, 'model': None, 'method': 'auto', 'model... │ │ │ │ └ <click.core.Context object at 0x00000257FF8733D0> │ │ │ └ <function pdf_command at 0x00000257FF898160> │ │ └ │ └ <function Context.invoke at 0x00000257BDF5CF70> └ <click.core.Context object at 0x00000257FF8733D0>

File "E:\tensflow\MinerU\lib\site-packages\click\core.py", line 783, in invoke return __callback(*args, **kwargs) │ └ {'pdf': 'E:\PDF-Extract-Kit\PDF-Extract-Kit\demo\模拟试卷.pdf', 'inside_model': True, 'model': None, 'method': 'auto', 'model... └ ()

File "E:\tensflow\MinerU\lib\site-packages\magic_pdf\cli\magicpdf.py", line 352, in pdf_command parse_doc(pdf) │ └ 'E:\PDF-Extract-Kit\PDF-Extract-Kit\demo\模拟试卷.pdf' └ <function pdf_command..parse_doc at 0x00000257FF88BE20>

File "E:\tensflow\MinerU\lib\site-packages\magic_pdf\cli\magicpdf.py", line 330, in parse_doc do_parse( └ <function do_parse at 0x00000257FF88BA30>

File "E:\tensflow\MinerU\lib\site-packages\magic_pdf\cli\magicpdf.py", line 111, in do_parse pipe.pipe_analyze() │ └ <function UNIPipe.pipe_analyze at 0x00000257FF88A7A0> └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x00000257FF872F50>

File "E:\tensflow\MinerU\lib\site-packages\magic_pdf\pipe\UNIPipe.py", line 29, in pipe_analyze self.model_list = doc_analyze(self.pdf_bytes, ocr=False) │ │ │ │ └ b'%PDF-1.7\r\n%\xb5\xb5\xb5\xb5\r\n1 0 obj\r\n<</Type/Catalog/Pages 2 0 R/Lang(zh-CN) /StructTreeRoot 35 0 R/MarkInfo<</Marke... │ │ │ └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x00000257FF872F50> │ │ └ <function doc_analyze at 0x00000257C06B6EF0> │ └ [] └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x00000257FF872F50>

File "E:\tensflow\MinerU\lib\site-packages\magic_pdf\model\doc_analyze_by_custom_model.py", line 103, in doc_analyze custom_model = model_manager.get_model(ocr, show_log) │ │ │ └ False │ │ └ False │ └ <function ModelSingleton.get_model at 0x00000257C06B6E60> └ <magic_pdf.model.doc_analyze_by_custom_model.ModelSingleton object at 0x00000257FF983A60>

File "E:\tensflow\MinerU\lib\site-packages\magic_pdf\model\doc_analyze_by_custom_model.py", line 63, in get_model self._models[key] = custom_model_init(ocr=ocr, show_log=show_log) │ │ │ │ │ └ False │ │ │ │ └ False │ │ │ └ <function custom_model_init at 0x00000257C06B6D40> │ │ └ (False, False) │ └ {} └ <magic_pdf.model.doc_analyze_by_custom_model.ModelSingleton object at 0x00000257FF983A60>

File "E:\tensflow\MinerU\lib\site-packages\magic_pdf\model\doc_analyze_by_custom_model.py", line 83, in custom_model_init from magic_pdf.model.pdf_extract_kit import CustomPEKModel

File "", line 1027, in _find_and_load File "", line 1006, in _find_and_load_unlocked File "", line 688, in _load_unlocked File "", line 883, in exec_module File "", line 241, in _call_with_frames_removed

File "E:\tensflow\MinerU\lib\site-packages\magic_pdf\model\pdf_extract_kit.py", line 18, in <module> from ultralytics import YOLO

File "E:\tensflow\MinerU\lib\site-packages\ultralytics\__init__.py", line 10, in <module> from ultralytics.data.explorer.explorer import Explorer

File "E:\tensflow\MinerU\lib\site-packages\ultralytics\data\__init__.py", line 3, in <module> from .base import BaseDataset

File "E:\tensflow\MinerU\lib\site-packages\ultralytics\data\base.py", line 17, in <module> from ultralytics.data.utils import FORMATS_HELP_MSG, HELP_URL, IMG_FORMATS

File "E:\tensflow\MinerU\lib\site-packages\ultralytics\data\utils.py", line 19, in <module> from ultralytics.nn.autobackend import check_class_names

File "E:\tensflow\MinerU\lib\site-packages\ultralytics\nn\__init__.py", line 3, in <module> from .tasks import (

File "E:\tensflow\MinerU\lib\site-packages\ultralytics\nn\tasks.py", line 10, in <module> from ultralytics.nn.modules import (

File "E:\tensflow\MinerU\lib\site-packages\ultralytics\nn\modules\__init__.py", line 20, in <module> from .block import (

File "E:\tensflow\MinerU\lib\site-packages\ultralytics\nn\modules\block.py", line 8, in <module> from ultralytics.utils.torch_utils import fuse_conv_and_bn

File "E:\tensflow\MinerU\lib\site-packages\ultralytics\utils\__init__.py", line 21, in <module> import matplotlib.pyplot as plt

File "E:\tensflow\MinerU\lib\site-packages\matplotlib\__init__.py", line 159, in <module> from . import _api, _version, cbook, _docstring, rcsetup │ └ <module 'matplotlib._version' from 'E:\tensflow\MinerU\lib\site-packages\matplotlib\_version.py'> └ <module 'matplotlib._api' from 'E:\tensflow\MinerU\lib\site-packages\matplotlib\_api\__init__.py'>

File "E:\tensflow\MinerU\lib\site-packages\matplotlib\cbook.py", line 32, in <module> from matplotlib import _api, _c_internal_utils └ <module 'matplotlib' from 'E:\tensflow\MinerU\lib\site-packages\matplotlib\__init__.py'>

ImportError: DLL load failed while importing _c_internal_utils: 找不到指定的模块。

2024-08-04 17:08:40.735 | ERROR | magic_pdf.model.pdf_extract_kit:<module>:25 - Required dependency not installed, please install by "pip install magic-pdf[full] detectron2 --extra-index-url https://myhloli.github.io/wheels/"

pkaho commented 3 months ago

Perhaps _c_internal_utils was deprecated in newer matplotlib versions (or there is some other cause). Downgrading matplotlib fixes it, for example matplotlib=3.7.5

FigureLean commented 3 months ago

笑川大佐, you're awesome!

myhloli commented 3 months ago

> Perhaps _c_internal_utils was deprecated in newer matplotlib versions (or there is some other cause). Downgrading matplotlib fixes it, for example matplotlib=3.7.5

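What's odd is that matplotlib 3.9.1 was already released in July, and over the past week we ran installation compatibility tests on fresh environments without hitting this problem 😂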

shmiluyu commented 3 months ago

> Perhaps _c_internal_utils was deprecated in newer matplotlib versions (or there is some other cause). Downgrading matplotlib fixes it, for example matplotlib=3.7.5

I downgraded to matplotlib 3.8.4 and the error went away. Thanks a lot!
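After a downgrade, it can help to confirm which version the active environment actually resolves, since several environments (conda, scoop, system Python) may coexist on one machine. A minimal sketch using the standard library's `importlib.metadata`; `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Reads package metadata only; it does not import matplotlib itself.
    print("matplotlib:", installed_version("matplotlib"))
```

Because this reads metadata without importing the package, it still works when `import matplotlib` fails with the DLL error.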

myhloli commented 3 months ago

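I checked my local development environment, and matplotlib is at version 3.9.1, the same as the latest official release.

I checked https://github.com/matplotlib/matplotlib, and _c_internal_utils exists in the latest code, so this import error shouldn't occur.

This looks similar to an import failure we ran into earlier with another library. The error message

ImportError: DLL load failed while importing _c_internal_utils: 找不到指定的模块。

may not mean that the _c_internal_utils module itself is missing. Rather, the error may occur inside _c_internal_utils when it needs to load some DLL from the local system. This most likely happens because a DLL that matplotlib needs was not unpacked to the correct path during installation. In that case, uninstalling and reinstalling the affected library usually fixes it.

There are many reasons a DLL might not end up in the right path. Sometimes antivirus software mistakes it for a trojan or virus and silently deletes it, which also causes this problem.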
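One way to tell these two cases apart is to check whether the compiled extension file is present on disk at all. The sketch below locates a package without importing it and looks for an extension module with a given stem. `find_extension_files` is a hypothetical helper, and that the file sits directly in the matplotlib package directory is an assumption based on the module path in the traceback.

```python
import importlib.machinery
import importlib.util
from pathlib import Path


def find_extension_files(package, stem):
    """List compiled extension files named `stem` inside `package`."""
    spec = importlib.util.find_spec(package)  # does not import a top-level package
    if spec is None or not spec.submodule_search_locations:
        return []
    found = []
    for location in spec.submodule_search_locations:
        # EXTENSION_SUFFIXES holds e.g. '.pyd' variants on Windows, '.so' on Linux.
        for suffix in importlib.machinery.EXTENSION_SUFFIXES:
            candidate = Path(location) / f"{stem}{suffix}"
            if candidate.is_file():
                found.append(candidate)
    return found


if __name__ == "__main__":
    print(find_extension_files("matplotlib", "_c_internal_utils"))
```

If the file is listed but the import still fails, the extension itself is present and the missing piece is more likely one of the DLLs it depends on.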

mmaatthhss commented 3 months ago

Same problem here, and downgrading matplotlib did fix it.

pkaho commented 3 months ago

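> I checked my local development environment, and matplotlib is at version 3.9.1, the same as the latest official release.
>
> I checked https://github.com/matplotlib/matplotlib, and _c_internal_utils exists in the latest code, so this import error shouldn't occur.

Indeed, the extension does exist. The official explanation is that it either needs to be recompiled or the MSVC Redistributable is missing.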
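A rough way to test the missing-runtime hypothesis is to ask Windows to load the Visual C++ runtime DLLs directly. This is a sketch under assumptions: the DLL names below are the usual Visual C++ 2015-2022 runtime libraries, not a list taken from matplotlib, and a successful load says nothing about whether the installed runtime is new enough.

```python
import ctypes
import sys

# Assumed DLL names: the usual Visual C++ 2015-2022 runtime libraries.
RUNTIME_DLLS = ("vcruntime140.dll", "msvcp140.dll")


def check_msvc_runtime(dll_names=RUNTIME_DLLS):
    """Map each DLL name to True/False depending on whether it loads.

    Returns an empty dict on non-Windows platforms.
    """
    if sys.platform != "win32":
        return {}
    results = {}
    for name in dll_names:
        try:
            ctypes.WinDLL(name)  # raises OSError if the DLL cannot be loaded
            results[name] = True
        except OSError:
            results[name] = False
    return results


if __name__ == "__main__":
    print(check_msvc_runtime())
```

Any `False` entry would suggest installing or repairing the Microsoft Visual C++ Redistributable before reinstalling matplotlib.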

myhloli commented 3 months ago

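> > I checked my local development environment, and matplotlib is at version 3.9.1, the same as the latest official release.
> >
> > I checked https://github.com/matplotlib/matplotlib, and _c_internal_utils exists in the latest code, so this import error shouldn't occur.
>
> Indeed, the extension does exist. The official explanation is that it either needs to be recompiled or the MSVC Redistributable is missing.

Indeed. I also checked matplotlib's release history on PyPI: 3.9.0 and every earlier version provide prebuilt packages for Windows, whereas 3.9.1 only provides prebuilt packages for Linux and macOS. So on Windows machines without an MSVC build environment, installing 3.9.1 automatically falls back to building from source, and the installation probably doesn't surface a useful build-failure message. The install then appears to succeed on these machines, but the DLL resources that need to be loaded are never compiled, which is why importing matplotlib fails on them. Going forward we will pin matplotlib to versions before 3.9.0 to prevent failed installs on these Windows machines.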
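The reasoning above can be turned into a small pre-flight check. The sketch below flags a matplotlib version as potentially affected when running on Windows. The threshold `FIRST_AFFECTED` is an assumption derived only from this comment (3.9.0 and earlier described as having Windows prebuilt packages), and treating every later version as affected is a conservative guess, since the thread only discusses 3.9.1. The helper names are hypothetical.

```python
import re
import sys

# Assumption from the comment above: 3.9.0 and earlier ship Windows wheels.
FIRST_AFFECTED = (3, 9, 1)


def parse_version(version):
    """Extract (major, minor, patch) from a version string like '3.9.1'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_affected(version, platform=sys.platform):
    """True if this matplotlib version may hit the issue on this platform."""
    return platform == "win32" and parse_version(version) >= FIRST_AFFECTED


if __name__ == "__main__":
    for v in ("3.7.5", "3.8.4", "3.9.0", "3.9.1"):
        print(v, is_affected(v, platform="win32"))
```

The versions in the usage loop are the ones mentioned in this thread: 3.7.5 and 3.8.4 were reported to work, and 3.9.1 is the one that failed.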

myhloli commented 3 months ago

https://github.com/opendatalab/MinerU/commit/9ececf3a1ec53db36ea05ac9016160dcc49182fd

https://github.com/opendatalab/MinerU/commit/252139099b3f3ccab9045ebad9aa4ba5182b86c4

fixed