opendatalab / MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
https://opendatalab.com/OpenSourceTools
GNU Affero General Public License v3.0
11.38k stars, 852 forks

AssertionError: Torch not compiled with CUDA enabled #240

Closed: tuhang closed this issue 1 month ago

tuhang commented 1 month ago

Description of the bug

Yesterday I finished the CPU deployment. Today I tried to set up the GPU deployment and ran into some problems.

How to reproduce the bug

I cloned the MinerU conda environment and then installed the PyTorch build for CUDA 12.4:

pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124

CUDA

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Feb_27_16:28:36_Pacific_Standard_Time_2024
Cuda compilation tools, release 12.4, V12.4.99
Build cuda_12.4.r12.4/compiler.33961263_0

Configuration file

{
    "bucket_info":{
        "bucket-name-1":["ak", "sk", "endpoint"],
        "bucket-name-2":["ak", "sk", "endpoint"]
    },
    "temp-output-dir":"F:/tmp",
    "models-dir":"F:/1_models/PDF-Extract-Kit/models",
    "device-mode":"cuda"
}
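A quick pre-flight check can flag a "device-mode" that the installed torch cannot honor before any model is loaded. The sketch below is only an illustration: `check_device_mode` is a hypothetical helper, not part of magic-pdf, and defaulting a missing "device-mode" to "cpu" is an assumption.

```python
import json
from typing import Optional


def check_device_mode(config_text: str, cuda_available: bool) -> Optional[str]:
    """Return a warning if the config asks for CUDA but torch cannot use it.

    Hypothetical helper for illustration; magic-pdf does not ship it.
    """
    config = json.loads(config_text)
    # Assumption: treat a missing "device-mode" key as "cpu".
    requested = str(config.get("device-mode", "cpu"))
    if requested.startswith("cuda") and not cuda_available:
        return (
            f'"device-mode" is "{requested}", but torch reports no usable CUDA; '
            'install a CUDA build of torch or set the mode to "cpu"'
        )
    return None


# Trimmed copy of the configuration shown above.
example_config = '{"temp-output-dir": "F:/tmp", "device-mode": "cuda"}'
print(check_device_mode(example_config, cuda_available=False))
```

In a real session, the second argument would come from `torch.cuda.is_available()`.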

Error output at runtime

2024-07-30 01:19:05.604 | INFO     | magic_pdf.libs.pdf_check:detect_invalid_chars:57 - cid_count: 0, text_len: 10, cid_chars_radio: 0.0
2024-07-30 01:19:05.604 | WARNING  | magic_pdf.filter.pdf_classify_by_type:classify:334 - pdf is not classified by area and text_len, by_image_area: False, by_text: True, by_avg_words: False, by_img_num: True, by_text_layout: True, by_img_narrow_strips: True, by_invalid_chars: True
[2024-07-30 01:19:14,201] [   ERROR] check_version.py:39 - Error fetching version info
Traceback (most recent call last):
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\urllib\request.py", line 1348, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\http\client.py", line 1283, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\http\client.py", line 1329, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\http\client.py", line 1278, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\http\client.py", line 1038, in _send_output
    self.send(msg)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\http\client.py", line 976, in send
    self.connect()
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\http\client.py", line 1455, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\ssl.py", line 513, in wrap_socket
    return self.sslsocket_class._create(
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\ssl.py", line 1104, in _create
    self.do_handshake()
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\ssl.py", line 1375, in do_handshake
    self._sslobj.do_handshake()
TimeoutError: _ssl.c:990: The handshake operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\albumentations\check_version.py", line 29, in fetch_version_info
    with opener.open(url, timeout=2) as response:
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\urllib\request.py", line 519, in open
    response = self._open(req, data)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\urllib\request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\urllib\request.py", line 496, in _call_chain
    result = func(*args)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\urllib\request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\urllib\request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error _ssl.c:990: The handshake operation timed out>
2024-07-30 01:19:16.135 | INFO     | magic_pdf.model.pdf_extract_kit:__init__:92 - DocAnalysis init, this may take some times. apply_layout: True, apply_formula: True, apply_ocr: True
2024-07-30 01:19:16.135 | INFO     | magic_pdf.model.pdf_extract_kit:__init__:100 - using device: cuda
CustomVisionEncoderDecoderModel init
CustomMBartForCausalLM init
CustomMBartDecoder init
Traceback (most recent call last):
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\tu_ha\.conda\envs\MinerU_GPU\Scripts\magic-pdf.exe\__main__.py", line 7, in <module>
    sys.exit(cli())
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\click\core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\magic_pdf\cli\magicpdf.py", line 325, in pdf_command
    do_parse(
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\magic_pdf\cli\magicpdf.py", line 111, in do_parse
    pipe.pipe_analyze()
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\magic_pdf\pipe\UNIPipe.py", line 31, in pipe_analyze
    self.model_list = doc_analyze(self.pdf_bytes, ocr=True)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\magic_pdf\model\doc_analyze_by_custom_model.py", line 69, in doc_analyze
    custom_model = CustomPEKModel(ocr=ocr, show_log=show_log, models_dir=local_models_dir, device=device)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\magic_pdf\model\pdf_extract_kit.py", line 111, in __init__
    self.mfr_model, mfr_vis_processors = mfr_model_init(mfr_weight_dir, mfr_cfg_path, _device_=self.device)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\magic_pdf\model\pdf_extract_kit.py", line 41, in mfr_model_init
    model = model.to(_device_)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\torch\nn\modules\module.py", line 1173, in to
    return self._apply(convert)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\torch\nn\modules\module.py", line 779, in _apply
    module._apply(fn)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\torch\nn\modules\module.py", line 779, in _apply
    module._apply(fn)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\torch\nn\modules\module.py", line 779, in _apply
    module._apply(fn)
  [Previous line repeated 3 more times]
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\torch\nn\modules\module.py", line 804, in _apply
    param_applied = fn(param)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\torch\nn\modules\module.py", line 1159, in convert
    return t.to(
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\torch\cuda\__init__.py", line 284, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
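
One detail in the traceback above may be worth checking: the launcher runs from the MinerU_GPU environment, while the library frames (including torch) are loaded from MinerU_CPU. Whether this contributes to the error is an assumption, not something confirmed in this thread. The sketch below uses a hypothetical helper, `conda_env_of`, to pull the environment name out of a path so the two can be compared.

```python
from pathlib import PureWindowsPath
from typing import Optional


def conda_env_of(path: str) -> Optional[str]:
    """Return the name that follows an 'envs' component in a Windows path."""
    parts = PureWindowsPath(path).parts
    for index, part in enumerate(parts[:-1]):
        if part.lower() == "envs":
            return parts[index + 1]
    return None


# Both paths are copied from the traceback above.
launcher = r"C:\Users\tu_ha\.conda\envs\MinerU_GPU\Scripts\magic-pdf.exe\__main__.py"
torch_file = r"C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\torch\cuda\__init__.py"

launcher_env = conda_env_of(launcher)
torch_env = conda_env_of(torch_file)
if launcher_env != torch_env:
    print(f"launcher is in {launcher_env}, but torch is loaded from {torch_env}")
```

In a live interpreter, comparing `sys.prefix` with `torch.__file__` gives the same information for the active environment.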

Operating system

Windows

Python version

3.10

Software version (magic-pdf --version)

0.6.x

Device mode

cuda

tuhang commented 1 month ago

pip list

(MinerU_GPU) C:\Users\tu_ha>pip list
Package                   Version
------------------------- ------------------
absl-py                   2.1.0
aiohttp                   3.9.5
aiosignal                 1.3.1
albucore                  0.0.12
albumentations            1.4.12
altair                    5.3.0
annotated-types           0.7.0
antlr4-python3-runtime    4.9.3
anyio                     4.4.0
argon2-cffi               23.1.0
argon2-cffi-bindings      21.2.0
arrow                     1.3.0
astor                     0.8.1
asttokens                 2.4.1
async-lru                 2.0.4
async-timeout             4.0.3
attrdict                  2.0.1
attrs                     23.2.0
Babel                     2.15.0
bce-python-sdk            0.9.17
beautifulsoup4            4.12.3
black                     24.4.2
bleach                    6.1.0
blinker                   1.8.2
boto3                     1.34.149
botocore                  1.34.149
braceexpand               0.1.7
Brotli                    1.1.0
cachetools                5.4.0
certifi                   2024.7.4
cffi                      1.16.0
charset-normalizer        3.3.2
click                     8.1.7
cloudpickle               3.0.0
colorama                  0.4.6
colorlog                  6.8.2
comm                      0.2.2
contourpy                 1.2.1
cryptography              43.0.0
cssselect                 1.2.0
cssutils                  2.11.1
cycler                    0.12.1
Cython                    3.0.10
datasets                  2.20.0
debugpy                   1.8.2
decorator                 5.1.1
defusedxml                0.7.1
detectron2                0.6
dill                      0.3.8
et-xmlfile                1.1.0
eva-decord                0.6.1
eval_type_backport        0.2.0
evaluate                  0.4.2
exceptiongroup            1.2.2
executing                 2.0.1
fairscale                 0.4.13
fast-langdetect           0.2.1
fastjsonschema            2.20.0
fasttext-wheel            0.9.2
filelock                  3.15.4
fire                      0.6.0
Flask                     3.0.3
flask-babel               4.0.0
fonttools                 4.53.1
fqdn                      1.5.1
frozenlist                1.4.1
fsspec                    2024.5.0
ftfy                      6.2.0
future                    1.0.0
fvcore                    0.1.5.post20221221
gitdb                     4.0.11
GitPython                 3.1.43
grpcio                    1.64.1
h11                       0.14.0
httpcore                  1.0.5
httpx                     0.27.0
huggingface-hub           0.24.2
hydra-core                1.3.2
idna                      3.7
imageio                   2.34.2
imgaug                    0.4.0
intel-openmp              2021.4.0
iopath                    0.1.9
ipykernel                 6.29.5
ipython                   8.26.0
isoduration               20.11.0
itsdangerous              2.2.0
jedi                      0.19.1
Jinja2                    3.1.4
jmespath                  1.0.1
joblib                    1.4.2
json5                     0.9.25
jsonpointer               3.0.0
jsonschema                4.23.0
jsonschema-specifications 2023.12.1
jupyter_client            8.6.2
jupyter_core              5.7.2
jupyter-events            0.10.0
jupyter-lsp               2.2.5
jupyter_server            2.14.2
jupyter_server_terminals  0.5.3
jupyterlab                4.2.4
jupyterlab_pygments       0.3.0
jupyterlab_server         2.27.3
kiwisolver                1.4.5
lazy_loader               0.4
lmdb                      1.5.1
loguru                    0.7.2
lxml                      5.2.2
magic-pdf                 0.6.1
Markdown                  3.6
markdown-it-py            3.0.0
MarkupSafe                2.1.5
matplotlib                3.9.1
matplotlib-inline         0.1.7
mdurl                     0.1.2
mistune                   3.0.2
mkl                       2021.4.0
more-itertools            10.3.0
mpmath                    1.3.0
multidict                 6.0.5
multiprocess              0.70.16
mypy-extensions           1.0.0
nbclient                  0.10.0
nbconvert                 7.16.4
nbformat                  5.10.4
nest-asyncio              1.6.0
networkx                  3.3
nltk                      3.8.1
notebook_shim             0.2.4
numpy                     1.26.4
omegaconf                 2.3.0
opencv-contrib-python     4.6.0.66
opencv-python             4.6.0.66
opencv-python-headless    4.10.0.84
openpyxl                  3.1.5
opt-einsum                3.3.0
overrides                 7.7.0
packaging                 24.1
paddleocr                 2.7.3
paddlepaddle              2.6.1
pandas                    2.2.2
pandocfilters             1.5.1
parso                     0.8.4
pathspec                  0.12.1
pdf2docx                  0.5.8
pdf2image                 1.17.0
pdfminer.six              20240706
pillow                    10.4.0
pip                       24.0
platformdirs              4.2.2
portalocker               2.10.1
premailer                 3.10.0
prometheus_client         0.20.0
prompt_toolkit            3.0.47
protobuf                  3.20.2
psutil                    6.0.0
pure_eval                 0.2.3
py-cpuinfo                9.0.0
pyarrow                   17.0.0
pyarrow-hotfix            0.6
pybind11                  2.13.1
pyclipper                 1.3.0.post5
pycocotools               2.0.8
pycparser                 2.22
pycryptodome              3.20.0
pydantic                  2.8.2
pydantic_core             2.20.1
pydeck                    0.9.1
Pygments                  2.18.0
PyMuPDF                   1.24.9
PyMuPDFb                  1.24.9
pyparsing                 3.1.2
pypdfium2                 4.30.0
python-dateutil           2.9.0.post0
python-docx               1.1.2
python-json-logger        2.0.7
pytz                      2024.1
pywin32                   306
pywinpty                  2.0.13
PyYAML                    6.0.1
pyzmq                     26.0.3
rapidfuzz                 3.9.4
rarfile                   4.2
referencing               0.35.1
regex                     2024.7.24
requests                  2.32.3
rfc3339-validator         0.1.4
rfc3986-validator         0.1.1
rich                      13.7.1
robust-downloader         0.0.2
rpds-py                   0.19.1
s3transfer                0.10.2
safetensors               0.4.3
scikit-image              0.24.0
scikit-learn              1.5.1
scipy                     1.14.0
seaborn                   0.13.2
Send2Trash                1.8.3
setuptools                69.5.1
shapely                   2.0.5
six                       1.16.0
smmap                     5.0.1
sniffio                   1.3.1
soupsieve                 2.5
stack-data                0.6.3
streamlit                 1.37.0
streamlit-drawable-canvas 0.9.3
sympy                     1.13.1
tabulate                  0.9.0
tbb                       2021.13.0
tenacity                  8.5.0
tensorboard               2.17.0
tensorboard-data-server   0.7.2
termcolor                 2.4.0
terminado                 0.18.1
threadpoolctl             3.5.0
tifffile                  2024.7.24
timm                      0.9.16
tinycss2                  1.3.0
tokenizers                0.19.1
toml                      0.10.2
tomli                     2.0.1
toolz                     0.12.1
torch                     2.4.0+cu124
torchtext                 0.18.0
torchvision               0.19.0+cu124
tornado                   6.4.1
tqdm                      4.66.4
traitlets                 5.14.3
transformers              4.40.0
types-python-dateutil     2.9.0.20240316
typing_extensions         4.12.2
tzdata                    2024.1
ultralytics               8.2.68
ultralytics-thop          2.0.0
unimernet                 0.1.1
uri-template              1.3.0
urllib3                   2.2.2
visualdl                  2.5.3
Wand                      0.6.13
watchdog                  4.0.1
wcwidth                   0.2.13
webcolors                 24.6.0
webdataset                0.2.86
webencodings              0.5.1
websocket-client          1.8.0
Werkzeug                  3.0.3
wheel                     0.43.0
win32-setctime            1.1.0
wordninja                 2.0.0
xxhash                    3.4.1
yacs                      0.1.8
yarl                      1.9.4
tuhang commented 1 month ago

I tried importing PyTorch from the Python interpreter, but the import itself failed. I suspect the package is the problem again.

(MinerU_GPU) C:\Users\tu_ha>python
Python 3.10.14 | packaged by Anaconda, Inc. | (main, May  6 2024, 19:44:50) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\tu_ha\.conda\envs\MinerU_GPU\lib\site-packages\torch\__init__.py", line 148, in <module>
    raise err
OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\tu_ha\.conda\envs\MinerU_GPU\lib\site-packages\torch\lib\fbgemm.dll" or one of its dependencies.
>>>
tuhang commented 1 month ago

Other issues recommend pinning the torch version. I tried that, but the same error was reported.

(MinerU_GPU) C:\Users\tu_ha>pip install torch==2.3.1 torchvision==0.18.1
Looking in indexes: https://mirrors.aliyun.com/pypi/simple/
Collecting torch==2.3.1
  Using cached https://mirrors.aliyun.com/pypi/packages/85/fc/ee5bb50eff313149657f173b003649677e27fa3aaae1ecc806add37f017c/torch-2.3.1-cp310-cp310-win_amd64.whl (159.8 MB)
Collecting torchvision==0.18.1
  Using cached https://mirrors.aliyun.com/pypi/packages/4e/62/3816637079b77875077678bd7087285a5b5589664f94f5ceb2d080cc024c/torchvision-0.18.1-cp310-cp310-win_amd64.whl (1.2 MB)
Requirement already satisfied: filelock in c:\users\tu_ha\.conda\envs\mineru_gpu\lib\site-packages (from torch==2.3.1) (3.15.4)
Requirement already satisfied: typing-extensions>=4.8.0 in c:\users\tu_ha\.conda\envs\mineru_gpu\lib\site-packages (from torch==2.3.1) (4.12.2)
Requirement already satisfied: sympy in c:\users\tu_ha\.conda\envs\mineru_gpu\lib\site-packages (from torch==2.3.1) (1.13.1)
Requirement already satisfied: networkx in c:\users\tu_ha\.conda\envs\mineru_gpu\lib\site-packages (from torch==2.3.1) (3.3)
Requirement already satisfied: jinja2 in c:\users\tu_ha\.conda\envs\mineru_gpu\lib\site-packages (from torch==2.3.1) (3.1.4)
Requirement already satisfied: fsspec in c:\users\tu_ha\.conda\envs\mineru_gpu\lib\site-packages (from torch==2.3.1) (2024.5.0)
Requirement already satisfied: mkl<=2021.4.0,>=2021.1.1 in c:\users\tu_ha\.conda\envs\mineru_gpu\lib\site-packages (from torch==2.3.1) (2021.4.0)
Requirement already satisfied: numpy in c:\users\tu_ha\.conda\envs\mineru_gpu\lib\site-packages (from torchvision==0.18.1) (1.26.4)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in c:\users\tu_ha\.conda\envs\mineru_gpu\lib\site-packages (from torchvision==0.18.1) (10.4.0)
Requirement already satisfied: intel-openmp==2021.* in c:\users\tu_ha\.conda\envs\mineru_gpu\lib\site-packages (from mkl<=2021.4.0,>=2021.1.1->torch==2.3.1) (2021.4.0)
Requirement already satisfied: tbb==2021.* in c:\users\tu_ha\.conda\envs\mineru_gpu\lib\site-packages (from mkl<=2021.4.0,>=2021.1.1->torch==2.3.1) (2021.13.0)
Requirement already satisfied: MarkupSafe>=2.0 in c:\users\tu_ha\.conda\envs\mineru_gpu\lib\site-packages (from jinja2->torch==2.3.1) (2.1.5)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in c:\users\tu_ha\.conda\envs\mineru_gpu\lib\site-packages (from sympy->torch==2.3.1) (1.3.0)
Installing collected packages: torch, torchvision
Successfully installed torch-2.3.1 torchvision-0.18.1

pip list

torch                     2.3.1
torchtext                 0.18.0
torchvision               0.18.1

The same error still occurs.

2024-07-30 01:34:44.452 | INFO     | magic_pdf.libs.pdf_check:detect_invalid_chars:57 - cid_count: 0, text_len: 291, cid_chars_radio: 0.0
2024-07-30 01:34:44.452 | WARNING  | magic_pdf.filter.pdf_classify_by_type:classify:334 - pdf is not classified by area and text_len, by_image_area: False, by_text: True, by_avg_words: False, by_img_num: True, by_text_layout: True, by_img_narrow_strips: True, by_invalid_chars: True
[2024-07-30 01:34:52,748] [   ERROR] check_version.py:39 - Error fetching version info
Traceback (most recent call last):
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\urllib\request.py", line 1348, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\http\client.py", line 1283, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\http\client.py", line 1329, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\http\client.py", line 1278, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\http\client.py", line 1038, in _send_output
    self.send(msg)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\http\client.py", line 976, in send
    self.connect()
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\http\client.py", line 1455, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\ssl.py", line 513, in wrap_socket
    return self.sslsocket_class._create(
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\ssl.py", line 1104, in _create
    self.do_handshake()
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\ssl.py", line 1375, in do_handshake
    self._sslobj.do_handshake()
TimeoutError: _ssl.c:990: The handshake operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\albumentations\check_version.py", line 29, in fetch_version_info
    with opener.open(url, timeout=2) as response:
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\urllib\request.py", line 519, in open
    response = self._open(req, data)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\urllib\request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\urllib\request.py", line 496, in _call_chain
    result = func(*args)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\urllib\request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\urllib\request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error _ssl.c:990: The handshake operation timed out>
2024-07-30 01:34:54.724 | INFO     | magic_pdf.model.pdf_extract_kit:__init__:92 - DocAnalysis init, this may take some times. apply_layout: True, apply_formula: True, apply_ocr: True
2024-07-30 01:34:54.724 | INFO     | magic_pdf.model.pdf_extract_kit:__init__:100 - using device: cuda
CustomVisionEncoderDecoderModel init
CustomMBartForCausalLM init
CustomMBartDecoder init
Traceback (most recent call last):
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\tu_ha\.conda\envs\MinerU_GPU\Scripts\magic-pdf.exe\__main__.py", line 7, in <module>
    sys.exit(cli())
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\click\core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\magic_pdf\cli\magicpdf.py", line 325, in pdf_command
    do_parse(
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\magic_pdf\cli\magicpdf.py", line 111, in do_parse
    pipe.pipe_analyze()
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\magic_pdf\pipe\UNIPipe.py", line 31, in pipe_analyze
    self.model_list = doc_analyze(self.pdf_bytes, ocr=True)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\magic_pdf\model\doc_analyze_by_custom_model.py", line 69, in doc_analyze
    custom_model = CustomPEKModel(ocr=ocr, show_log=show_log, models_dir=local_models_dir, device=device)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\magic_pdf\model\pdf_extract_kit.py", line 111, in __init__
    self.mfr_model, mfr_vis_processors = mfr_model_init(mfr_weight_dir, mfr_cfg_path, _device_=self.device)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\magic_pdf\model\pdf_extract_kit.py", line 41, in mfr_model_init
    model = model.to(_device_)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\torch\nn\modules\module.py", line 1173, in to
    return self._apply(convert)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\torch\nn\modules\module.py", line 779, in _apply
    module._apply(fn)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\torch\nn\modules\module.py", line 779, in _apply
    module._apply(fn)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\torch\nn\modules\module.py", line 779, in _apply
    module._apply(fn)
  [Previous line repeated 3 more times]
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\torch\nn\modules\module.py", line 804, in _apply
    param_applied = fn(param)
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\torch\nn\modules\module.py", line 1159, in convert
    return t.to(
  File "C:\Users\tu_ha\.conda\envs\MinerU_CPU\lib\site-packages\torch\cuda\__init__.py", line 284, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
tuhang commented 1 month ago

After changing the version, torch imports successfully, but CUDA is not available:

(MinerU_GPU) C:\Users\tu_ha>python
Python 3.10.14 | packaged by Anaconda, Inc. | (main, May  6 2024, 19:44:50) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.cuda.is_available()) 
False
>>> print(torch.cuda.device_count())
0
>>> print(torch.version.cuda) 
None
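
The three values printed above can be folded into one check. The sketch below is only an illustration: `describe_torch_build` is a hypothetical helper, and it relies on the fact that CPU-only builds of torch typically report `torch.version.cuda` as `None`, as in the session above.

```python
from typing import Optional


def describe_torch_build(version: str, cuda_version: Optional[str]) -> str:
    """Summarize a torch install from torch.__version__ and torch.version.cuda."""
    # Wheels from the PyTorch index carry a local tag such as "+cu118" or "+cpu";
    # the tag is empty when the version string has no "+" part.
    local_tag = version.partition("+")[2]
    if cuda_version is None:
        return f"torch {version}: no CUDA support in this build (tag: {local_tag or 'none'})"
    return f"torch {version}: built for CUDA {cuda_version} (tag: {local_tag or 'none'})"


# Version strings taken from the pip list output in this thread.
print(describe_torch_build("2.3.1", None))
print(describe_torch_build("2.3.1+cu118", "11.8"))

# Optional live check; guarded so the script still runs if torch fails to import.
try:
    import torch

    print(describe_torch_build(torch.__version__, torch.version.cuda))
except (ImportError, OSError) as exc:
    print(f"torch could not be imported: {exc}")
```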
myhloli commented 1 month ago

torch 2.3.1 is the latest version we support. Please use

pip install torch==2.3.1 torchvision==0.18.1 --index-url https://download.pytorch.org/whl/cu118

to install torch with CUDA.

tuhang commented 1 month ago

Is it necessary to downgrade CUDA to 118?

myhloli commented 1 month ago

Is it necessary to downgrade CUDA to 118?

If you want CUDA acceleration for both PyTorch and PaddlePaddle, CUDA 11.8 is the only choice on Windows. In my tests on Ubuntu 22.04, torch with cu12 and paddlepaddle with cu11 worked well together, but on Windows they must use the same CUDA version.
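
The same-version rule for Windows can be expressed as a small comparison. The sketch below is only an illustration with hypothetical helper names. It takes the CUDA versions as plain strings, for example the value of `torch.version.cuda`. How to read the corresponding value from PaddlePaddle is not shown in this thread, so that input is left to the reader.

```python
from typing import Optional, Tuple


def cuda_major_minor(version: str) -> Tuple[int, int]:
    """Parse a dotted CUDA version such as "11.8" or "11.8.0" into (major, minor)."""
    parts = version.split(".")
    minor = int(parts[1]) if len(parts) > 1 else 0
    return int(parts[0]), minor


def same_cuda_version(torch_cuda: Optional[str], paddle_cuda: Optional[str]) -> bool:
    """Return True only if both frameworks report the same CUDA major.minor."""
    if torch_cuda is None or paddle_cuda is None:
        # A missing value means at least one framework is a CPU-only build.
        return False
    return cuda_major_minor(torch_cuda) == cuda_major_minor(paddle_cuda)


print(same_cuda_version("11.8", "11.8"))  # matching pair, as recommended for Windows
print(same_cuda_version("12.4", "11.8"))  # mixed pair, as in the original attempt
```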

tuhang commented 1 month ago

After I switched to the following dependencies, the GPU became available, and processing was much faster than on the CPU.

torch                     2.3.1+cu118
torchtext                 0.18.0
torchvision               0.18.1+cu118

It took me six hours in one night to get through the conda clone, the conda and pip installs, and the 2.3.1+cpu build (the CPU-only torch pitfall), and I hit every pitfall of CUDA version dependencies along the way. The conclusion is that the dependencies must be installed exactly as the README requires. Thank you for replying so late at night.