Closed cyicz123 closed 5 days ago
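When building with Docker as described in the documentation, the build fails because the yaml library is missing. Modifying the Dockerfile to install PyYAML resolves the problem.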
# Install PyYAML together with modelscope
RUN /bin/bash -c "pip3 install modelscope PyYAML && \
    wget https://gitee.com/myhloli/MinerU/raw/master/scripts/download_models.py && \
    python3 download_models.py && \
    sed -i 's|cpu|cuda|g' /root/magic-pdf.json"
(base) ➜ MinerU docker build -t mineru:latest .
[+] Building 398.4s (10/10) FINISHED docker:default
 => [internal] load build definition from Dockerfile 2.3s
 => => transferring dockerfile: 2.11kB 0.0s
 => [internal] load metadata for docker.io/library/ubuntu:22.04 32.5s
 => [internal] load .dockerignore 0.3s
 => => transferring context: 2B 0.0s
 => [1/7] FROM docker.io/library/ubuntu:22.04@sha256:0e5e4a57c2499249aafc3b40fcd541e9a456aab7296681a3994d631587203f97 0.0s
 => CACHED [2/7] RUN apt-get update && apt-get install -y software-properties-common && add-apt-repository ppa:deadsnakes/ppa && apt-get update && ap 0.0s
 => CACHED [3/7] RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1 0.0s
 => CACHED [4/7] RUN python3 -m venv /opt/mineru_venv 0.0s
 => CACHED [5/7] RUN /bin/bash -c "source /opt/mineru_venv/bin/activate && pip3 install --upgrade pip && wget https://gitee.com/myhloli/MinerU/raw/master/requirement 0.0s
 => CACHED [6/7] RUN /bin/bash -c "wget https://gitee.com/myhloli/MinerU/raw/master/magic-pdf.template.json && cp magic-pdf.template.json /root/magic-pdf.json && sou 0.0s
 => ERROR [7/7] RUN /bin/bash -c "pip3 install modelscope && wget https://gitee.com/myhloli/MinerU/raw/master/scripts/download_models.py && python3 download_models 361.5s
------
 > [7/7] RUN /bin/bash -c "pip3 install modelscope && wget https://gitee.com/myhloli/MinerU/raw/master/scripts/download_models.py && python3 download_models.py && sed -i 's|cpu|cuda|g' /root/magic-pdf.json":
33.05 WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='pypi.org', port=443): Read timed out. (read timeout=15)")': /simple/modelscope/
35.60 Collecting modelscope
37.59 Downloading modelscope-1.20.0-py3-none-any.whl (5.8 MB)
311.2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.8/5.8 MB 15.4 kB/s eta 0:00:00
313.0 Collecting urllib3>=1.26
313.1 Downloading urllib3-2.2.3-py3-none-any.whl (126 kB)
320.4 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 126.3/126.3 KB 19.3 kB/s eta 0:00:00
321.9 Collecting tqdm>=4.64.0
322.0 Downloading tqdm-4.67.0-py3-none-any.whl (78 kB)
326.3 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 78.6/78.6 KB 17.7 kB/s eta 0:00:00
327.4 Collecting requests>=2.25
327.5 Downloading requests-2.32.3-py3-none-any.whl (64 kB)
330.5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64.9/64.9 KB 24.8 kB/s eta 0:00:00
331.3 Collecting certifi>=2017.4.17
331.4 Downloading certifi-2024.8.30-py3-none-any.whl (167 kB)
339.0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 167.3/167.3 KB 22.4 kB/s eta 0:00:00
339.3 Collecting idna<4,>=2.5
339.4 Downloading idna-3.10-py3-none-any.whl (70 kB)
344.0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 70.4/70.4 KB 13.9 kB/s eta 0:00:00
347.3 Collecting charset-normalizer<4,>=2
347.4 Downloading charset_normalizer-3.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (144 kB)
353.2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 144.8/144.8 KB 25.3 kB/s eta 0:00:00
353.6 Installing collected packages: urllib3, tqdm, idna, charset-normalizer, certifi, requests, modelscope
359.3 Successfully installed certifi-2024.8.30 charset-normalizer-3.4.0 idna-3.10 modelscope-1.20.0 requests-2.32.3 tqdm-4.67.0 urllib3-2.2.3
359.3 WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
359.4 --2024-11-14 04:53:21-- https://gitee.com/myhloli/MinerU/raw/master/scripts/download_models.py
359.4 Resolving gitee.com (gitee.com)... 180.76.198.225, 180.76.198.77
359.4 Connecting to gitee.com (gitee.com)|180.76.198.225|:443... connected.
359.5 HTTP request sent, awaiting response... 200 OK
359.7 Length: 1921 (1.9K) [text/plain]
359.7 Saving to: 'download_models.py'
359.7
359.7 0K . 100% 164M=0s
359.7
359.7 2024-11-14 04:53:21 (164 MB/s) - 'download_models.py' saved [1921/1921]
359.7
359.9 Traceback (most recent call last):
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/utils/import_utils.py", line 451, in _get_module
359.9     return importlib.import_module('.' + module_name, self.__name__)
359.9   File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
359.9     return _bootstrap._gcd_import(name[level:], package, level)
359.9   File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
359.9   File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
359.9   File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
359.9   File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
359.9   File "<frozen importlib._bootstrap_external>", line 883, in exec_module
359.9   File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/fileio/io.py", line 8, in <module>
359.9     from .format import JsonHandler, YamlHandler
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/fileio/format/__init__.py", line 5, in <module>
359.9     from .yaml import YamlHandler
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/fileio/format/yaml.py", line 2, in <module>
359.9     import yaml
359.9 ModuleNotFoundError: No module named 'yaml'
359.9
359.9 The above exception was the direct cause of the following exception:
359.9
359.9 Traceback (most recent call last):
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/utils/import_utils.py", line 451, in _get_module
359.9     return importlib.import_module('.' + module_name, self.__name__)
359.9   File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
359.9     return _bootstrap._gcd_import(name[level:], package, level)
359.9   File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
359.9   File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
359.9   File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
359.9   File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
359.9   File "<frozen importlib._bootstrap_external>", line 883, in exec_module
359.9   File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/hub/snapshot_download.py", line 11, in <module>
359.9     from modelscope.hub.api import HubApi, ModelScopeConfig
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/hub/api.py", line 26, in <module>
359.9     from modelscope.fileio import io
359.9   File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/utils/import_utils.py", line 432, in __getattr__
359.9     value = self._get_module(name)
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/utils/import_utils.py", line 453, in _get_module
359.9     raise RuntimeError(
359.9 RuntimeError: Failed to import modelscope.fileio.io because of the following error (look up to see its traceback):
359.9 No module named 'yaml'
359.9
359.9 The above exception was the direct cause of the following exception:
359.9
359.9 Traceback (most recent call last):
359.9   File "//download_models.py", line 5, in <module>
359.9     from modelscope import snapshot_download
359.9   File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/utils/import_utils.py", line 434, in __getattr__
359.9     module = self._get_module(self._class_to_module[name])
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/utils/import_utils.py", line 453, in _get_module
359.9     raise RuntimeError(
359.9 RuntimeError: Failed to import modelscope.hub.snapshot_download because of the following error (look up to see its traceback):
359.9 Failed to import modelscope.fileio.io because of the following error (look up to see its traceback):
359.9 No module named 'yaml'
------
Dockerfile:44
--------------------
  43 |     # Download models and update the configuration file
  44 | >>> RUN /bin/bash -c "pip3 install modelscope && \
  45 | >>>     wget https://gitee.com/myhloli/MinerU/raw/master/scripts/download_models.py && \
  46 | >>>     python3 download_models.py && \
  47 | >>>     sed -i 's|cpu|cuda|g' /root/magic-pdf.json"
  48 |
--------------------
ERROR: failed to solve: process "/bin/sh -c /bin/bash -c \"pip3 install modelscope && wget https://gitee.com/myhloli/MinerU/raw/master/scripts/download_models.py && python3 download_models.py && sed -i 's|cpu|cuda|g' /root/magic-pdf.json\"" did not complete successfully: exit code: 1
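Retesting confirmed that this is caused by modelscope 1.20.0, which added an import yaml statement without updating its requirements.txt. As a temporary workaround, pin modelscope to version 1.19.2 or install pyyaml yourself.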
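To make a missing dependency like this fail with a short, readable message instead of a long chained traceback, a small pre-flight check could run before download_models.py. The following is only a sketch: the helper name `missing_modules` is hypothetical, and running such a check as part of the Docker build is an assumption, not something the MinerU Dockerfile does.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    missing = []
    for name in names:
        # find_spec returns None when no importable top-level module
        # with this name is installed in the current environment.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


# "yaml" is the import name provided by the PyYAML package.
missing = missing_modules(["yaml"])
print("missing modules:", missing)
```

If the printed list contains yaml, installing PyYAML (or pinning modelscope to 1.19.2, as suggested above) should be done before the model download step.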
Fixed in modelscope v1.20.1: https://github.com/modelscope/modelscope/releases/tag/v1.20.1
Operating system: Linux
Python version: 3.10
Software version (magic-pdf --version): 0.9.x
Device mode: cuda