Open RunshengZhu opened 17 hours ago
The CUDA version inside the container is independent of the one on the server. Try
docker run --rm --gpus=all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
to check whether the GPU can be found from inside Docker.
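If it is easier to run the same check from a script, something like the following may work. It is only a sketch: `list_gpus` is a hypothetical helper name, and it assumes `nvidia-smi` is on the `PATH` of whatever environment (host or container) it runs in.

```python
import shutil
import subprocess


def list_gpus():
    """Return one 'name, driver_version' string per GPU, or [] if none can be queried."""
    # nvidia-smi is missing when the NVIDIA driver utilities are not exposed here.
    if shutil.which("nvidia-smi") is None:
        return []
    completed = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    # A non-zero exit code usually means the driver could not be reached.
    if completed.returncode != 0:
        return []
    return [line.strip() for line in completed.stdout.splitlines() if line.strip()]


print(list_gpus())
```

An empty list means the GPU is not visible from that environment. A non-empty list only shows that the driver utilities work, not that PyTorch or Paddle can initialize CUDA.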
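The GPU is found, and nvidia-smi reports CUDA version 12.6. Inside the container this command still shows the host's CUDA version, which has nothing to do with the toolkit. I am now trying other base images.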
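The CUDA version shown by nvidia-smi is the highest CUDA version the driver supports; anything greater than or equal to 12.1 is fine. Inside Docker, the CUDA environment is the one pulled in by pip, and it is independent of the server's CUDA environment.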
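The "greater than or equal to 12.1" rule above can be written as a small check. This is a rough sketch only: the helper names are made up for illustration, and the 12.1 threshold is taken from the comment above rather than from any MinerU documentation.

```python
import re


def parse_cuda_version(text):
    """Return (major, minor) from a string such as '12.6' or 'CUDA Version: 12.6'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no CUDA version found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def driver_is_new_enough(smi_cuda, required="12.1"):
    """Compare the version reported by nvidia-smi against the required minimum."""
    # Tuples compare element by element, so (12, 6) >= (12, 1) is True.
    return parse_cuda_version(smi_cuda) >= parse_cuda_version(required)


print(driver_is_new_enough("CUDA Version: 12.6"))  # True
print(driver_is_new_enough("11.8"))                # False
```

By this rule, a host whose nvidia-smi reports 12.6 passes the check, so the driver version alone would not explain the failure.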
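Right, the GPU is found inside the container. Can you tell from the nvidia-smi output whether this is a hardware configuration problem? If not, is there a change to the Dockerfile that would avoid this problem?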
(Sent by email in reply to Xiaomeng's message of 2024-12-04 17:08, subject: Re: [opendatalab/MinerU] The Dockerfile no longer works on CUDA 12.6 (Issue #1188))
The same error is also reported outside the container, even though nvidia-smi displays the GPU normally:
import tensorrt_llm failed, if do not use tensorrt, ignore this message
import lmdeploy failed, if do not use lmdeploy, ignore this message
2024-12-05 00:12:12.624 | INFO | magic_pdf.libs.pdf_check:detect_invalid_chars_by_pymupdf:84 - uffd_count: 0, text_len: 33376, uffd_chars_radio: 0.0
2024-12-05 00:12:12.629 | INFO | magic_pdf.model.pdf_extract_kit:init:78 - DocAnalysis init, this may take some times, layout_model: layoutlmv3, apply_formula: True, apply_ocr: False, apply_table: False, table_model: rapid_table, lang: None
2024-12-05 00:12:12.629 | INFO | magic_pdf.model.pdf_extract_kit:init:91 - using device: cuda
2024-12-05 00:12:12.630 | INFO | magic_pdf.model.pdf_extract_kit:init:95 - using models_dir: /home/ec2-user/.cache/huggingface/hub/models--opendatalab--PDF-Extract-Kit-1.0/snapshots/38e484355b9acf5654030286bf72490e27842a3c/models
CustomVisionEncoderDecoderModel init
VariableUnimerNetModel init
VariableUnimerNetPatchEmbeddings init
VariableUnimerNetModel init
VariableUnimerNetPatchEmbeddings init
CustomMBartForCausalLM init
CustomMBartDecoder init
2024-12-05 00:12:40.640 | ERROR | magic_pdf.tools.cli:parse_doc:108 - CUDA driver initialization failed, you might not have a CUDA gpu.
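Since nvidia-smi works but CUDA initialization fails, it may help to call the driver directly, without PyTorch or Paddle in between. The sketch below loads the driver library through `ctypes` and calls `cuInit` and `cuDriverGetVersion` from the CUDA Driver API. The helper name `probe_cuda_driver` is hypothetical, and `libcuda.so.1` is assumed to be the library name on a typical Linux install.

```python
import ctypes


def probe_cuda_driver(library_name="libcuda.so.1"):
    """Load the CUDA driver library and call cuInit(0) directly."""
    result = {"loaded": False, "cuinit_code": None, "driver_version": None, "error": None}
    try:
        lib = ctypes.CDLL(library_name)
    except OSError as exc:
        # The driver library is not visible in this environment.
        result["error"] = str(exc)
        return result
    result["loaded"] = True
    # cuInit returns 0 (CUDA_SUCCESS) when the driver initializes correctly;
    # any other value is a CUresult error code.
    result["cuinit_code"] = lib.cuInit(0)
    version = ctypes.c_int(0)
    if lib.cuDriverGetVersion(ctypes.byref(version)) == 0:
        # The version is encoded as 1000 * major + 10 * minor, e.g. 12060 for 12.6.
        result["driver_version"] = (version.value // 1000, (version.value % 1000) // 10)
    return result


print(probe_cuda_driver())
```

A non-zero `cuinit_code` here would suggest the failure sits at the driver level on the host rather than in the Dockerfile or the pip-installed packages. That reading is an assumption, not something confirmed in this thread.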
The output of nvcc -V is as follows:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Mar_28_02:18:24_PDT_2024
Cuda compilation tools, release 12.4, V12.4.131
Build cuda_12.4.r12.4/compiler.34097967_0
Description of the bug
Following the documentation, I pulled the latest Dockerfile and built it successfully, but when I entered the container and processed a PDF file, magic-pdf reported the following error:
/opt/mineru_venv/lib/python3.10/site-packages/paddle/base/framework.py:743: UserWarning: You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default.
warnings.warn(
import tensorrt_llm failed, if do not use tensorrt, ignore this message
import lmdeploy failed, if do not use lmdeploy, ignore this message
2024-12-04 07:16:35.273 | INFO | magic_pdf.libs.pdf_check:detect_invalid_chars_by_pymupdf:84 - uffd_count: 0, text_len: 38047, uffd_chars_radio: 0.0
2024-12-04 07:16:35.277 | INFO | magic_pdf.model.pdf_extract_kit:init:78 - DocAnalysis init, this may take some times, layout_model: layoutlmv3, apply_formula: True, apply_ocr: False, apply_table: False, table_model: rapid_table, lang: None
2024-12-04 07:16:35.277 | INFO | magic_pdf.model.pdf_extract_kit:init:91 - using device: cuda
2024-12-04 07:16:35.277 | INFO | magic_pdf.model.pdf_extract_kit:init:95 - using models_dir: /root/.cache/modelscope/hub/opendatalab/PDF-Extract-Kit-1___0/models
CustomVisionEncoderDecoderModel init
VariableUnimerNetModel init
VariableUnimerNetPatchEmbeddings init
VariableUnimerNetModel init
VariableUnimerNetPatchEmbeddings init
CustomMBartForCausalLM init
CustomMBartDecoder init
2024-12-04 07:16:46.208 | ERROR | magic_pdf.tools.cli:parse_doc:108 - CUDA driver initialization failed, you might not have a CUDA gpu.
How to reproduce the bug
The problem can be reproduced on a server with CUDA 12.6 by following these steps:
wget https://github.com/opendatalab/MinerU/raw/master/Dockerfile
docker build -t mineru:latest .
docker run --rm -it --gpus=all mineru:latest /bin/bash
magic-pdf --help
magic-pdf -p xxx.pdf -o output
Operating system
Linux
Python version
3.10
Software version (magic-pdf --version)
0.10.x
Device mode
cuda