InternLM / lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
https://lmdeploy.readthedocs.io/en/latest/
Apache License 2.0

[Bug] ImportError: DLL load failed while importing _turbomind: 找不到指定的模块。 (The specified module could not be found.) #1526

Open StarCycle opened 2 months ago

StarCycle commented 2 months ago

Describe the bug

I installed directly with pip; because of the triton dependency, it ended up automatically installing lmdeploy 0.2.2.

Reproduction

PS G:\Python-3.9.12\Lib\site-packages\lmdeploy> lmdeploy serve api_server internlm/internlm-xcomposer2-vl-1_8b --server-port 23333
Traceback (most recent call last):
  File "G:\Python-3.9.12\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "G:\Python-3.9.12\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "G:\Python-3.9.12\Scripts\lmdeploy.exe\__main__.py", line 7, in <module>
    sys.exit(run())
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\cli\entrypoint.py", line 18, in run
    args.run(args)
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\cli\serve.py", line 248, in api_server
    run_api_server(args.model_path,
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\serve\openai\api_server.py", line 994, in serve
    VariableInterface.async_engine = AsyncEngine(
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\serve\async_engine.py", line 67, in __init__
    self._build_turbomind(model_path=model_path,
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\serve\async_engine.py", line 107, in _build_turbomind
    from lmdeploy import turbomind as tm
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\__init__.py", line 24, in <module>
    from .turbomind import TurboMind  # noqa: E402
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\turbomind.py", line 26, in <module>
    from .deploy.converter import (get_model_format, supported_formats,
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\deploy\converter.py", line 16, in <module>
    from .target_model.base import OUTPUT_MODELS, TurbomindModelConfig
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\deploy\target_model\__init__.py", line 3, in <module>
    from .w4 import TurbomindW4Model  # noqa: F401
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\deploy\target_model\w4.py", line 17, in <module>
    import _turbomind as _tm  # noqa: E402
ImportError: DLL load failed while importing _turbomind: 找不到指定的模块。

Environment

win11

PS G:\Python-3.9.12\Lib\site-packages\lmdeploy> lmdeploy check_env
sys.platform: win32
Python: 3.9.12 (tags/v3.9.12:b28265d, Mar 23 2022, 23:52:46) [MSC v.1929 64 bit (AMD64)]
CUDA available: False
MUSA available: False
numpy_random_seed: 2147483648
GCC: n/a
PyTorch: 2.3.0+cpu
PyTorch compiling details: PyTorch built with:
  - C++ Version: 201703
  - MSVC 192930151
  - Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v3.3.6 (Git Hash 86e6af5974177e513fd3fee58425e1063e7f1361)
  - OpenMP 2019
  - LAPACK is enabled (usually provided by MKL)
  - CPU capability usage: AVX512
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /Zc:__cplusplus /bigobj /FS /utf-8 -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE /wd4624 /wd4068 /wd4067 /wd4267 /wd4661 /wd4717 /wd4244 /wd4804 /wd4273, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.3.0, USE_CUDA=0, USE_CUDNN=OFF, USE_CUSPARSELT=OFF, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF,

TorchVision: 0.18.0+cpu
LMDeploy: 0.2.2+
transformers: 4.40.1
gradio: Not Found
fastapi: 0.110.2
pydantic: 2.7.1

Environment variables

PS G:\Python-3.9.12\Lib\site-packages\lmdeploy> echo $env:PATH
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\libnvvp;G:\Python-3.9.12\Scripts\;G:\Python-3.9.12\;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Windows\System32\OpenSSH\;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Program Files\NVIDIA Corporation\NVIDIA NvDLISR;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;C:\WINDOWS\System32\OpenSSH\;C:\Program Files\Git\cmd;C:\Program Files\NVIDIA Corporation\Nsight Compute 2022.3.0\;C:\Users\marti\AppData\Local\Microsoft\WindowsApps;G:\Microsoft VS Code\bin

nvidia-smi

PS G:\Python-3.9.12\Lib\site-packages\lmdeploy> nvidia-smi
Tue Apr 30 12:32:00 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 522.06       Driver Version: 522.06       CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ... WDDM  | 00000000:01:00.0 Off |                  N/A |
| N/A   57C    P0    30W /  N/A |      6MiB /  8192MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+


Error traceback

_No response_
StarCycle commented 2 months ago

I reinstalled CUDA (12.4) and reinstalled PyTorch; it still doesn't work.

PS C:\Users\marti> lmdeploy check_env
sys.platform: win32
Python: 3.9.12 (tags/v3.9.12:b28265d, Mar 23 2022, 23:52:46) [MSC v.1929 64 bit (AMD64)]
CUDA available: True
MUSA available: False
numpy_random_seed: 2147483648
GPU 0: NVIDIA GeForce RTX 3070 Laptop GPU
CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4
NVCC: Cuda compilation tools, release 12.4, V12.4.131
GCC: n/a
PyTorch: 2.3.0+cu121
PyTorch compiling details: PyTorch built with:

TorchVision: 0.18.0+cpu
LMDeploy: 0.2.2+
transformers: 4.40.1
gradio: Not Found
fastapi: 0.110.2
pydantic: 2.7.1

The error is still: ImportError: DLL load failed while importing _turbomind: 找不到指定的模块。

@irexyc @lvhan028 By the way, when will lmdeploy get a chat group of its own, like xtuner has, instead of using the internlm2 one...

lvhan028 commented 2 months ago

@vansin Could you help set up a dedicated lmdeploy user group?

irexyc commented 2 months ago

Install the latest LMDeploy; don't install 0.2.2.

Your GPU driver only supports up to CUDA 11.8, so you can't install the package from PyPI (it is built with CUDA 12) unless you upgrade the driver.

Create a new virtual environment, then run:

pip install https://github.com/InternLM/lmdeploy/releases/download/v0.4.0/lmdeploy-0.4.0+cu118-cp39-cp39-win_amd64.whl --extra-index-url https://download.pytorch.org/whl/cu118

Then see whether it runs. If it doesn't, a message should be printed at startup; please share it.
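
To confirm the wheel installed correctly, here is a quick smoke test (a sketch; it assumes the package exposes lmdeploy.__version__, which check_env also reports):

# run inside the new environment
import lmdeploy
print(lmdeploy.__version__)  # expect 0.4.0 for the wheel above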

EdwardBuck1 commented 2 months ago

C:\Users\marti>pip install https://github.com/InternLM/lmdeploy/releases/download/v0.4.0/lmdeploy-0.4.0+cu118-cp39-cp39-win_amd64.whl --extra-index-url https://download.pytorch.org/whl/cu118
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu118
ERROR: lmdeploy-0.4.0+cu118-cp39-cp39-win_amd64.whl is not a supported wheel on this platform.

It failed with an error.

irexyc commented 2 months ago

Check your Python version and pick the matching wheel. I suggest using conda.

This URL is for Python 3.9: https://github.com/InternLM/lmdeploy/releases/download/v0.4.0/lmdeploy-0.4.0+cu118-cp39-cp39-win_amd64.whl
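
A quick way to read off the tags your interpreter needs, using only the standard library:

import sys
import sysconfig

# the cpXY part of the wheel filename must match this
print(f'cp{sys.version_info.major}{sys.version_info.minor}')  # e.g. cp39, cp311
# and the platform part must match this (win-amd64 -> win_amd64 in the filename)
print(sysconfig.get_platform())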

EdwardBuck1 commented 2 months ago

Thanks a lot!

EdwardBuck1 commented 2 months ago

C:\Users\marti>pip install https://github.com/InternLM/lmdeploy/releases/download/v0.4.0/lmdeploy-0.4.0+cu118-cp311-cp311-win_amd64.whl
Defaulting to user installation because normal site-packages is not writeable
Collecting lmdeploy==0.4.0+cu118
  Downloading https://github.com/InternLM/lmdeploy/releases/download/v0.4.0/lmdeploy-0.4.0+cu118-cp311-cp311-win_amd64.whl (52.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 52.7/52.7 MB 19.9 MB/s eta 0:00:00
Collecting einops (from lmdeploy==0.4.0+cu118)
  Using cached einops-0.8.0-py3-none-any.whl.metadata (12 kB)
Requirement already satisfied: fastapi in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (0.110.3)
Requirement already satisfied: fire in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (0.6.0)
Requirement already satisfied: mmengine-lite in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (0.10.4)
Requirement already satisfied: numpy in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (1.26.4)
Collecting peft<=0.9.0 (from lmdeploy==0.4.0+cu118)
  Using cached peft-0.9.0-py3-none-any.whl.metadata (13 kB)
Collecting pillow (from lmdeploy==0.4.0+cu118)
  Using cached pillow-10.3.0-cp311-cp311-win_amd64.whl.metadata (9.4 kB)
Requirement already satisfied: protobuf in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (4.25.3)
Requirement already satisfied: pydantic>2.0.0 in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (2.7.1)
Requirement already satisfied: pynvml in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (11.5.0)
Requirement already satisfied: safetensors in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (0.4.3)
Requirement already satisfied: sentencepiece in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (0.2.0)
Requirement already satisfied: shortuuid in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (1.0.13)
Requirement already satisfied: tiktoken in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (0.6.0)
Collecting torch<=2.2.2,>=2.0.0 (from lmdeploy==0.4.0+cu118)
  Using cached torch-2.2.2-cp311-cp311-win_amd64.whl.metadata (26 kB)
Requirement already satisfied: transformers in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (4.40.1)
INFO: pip is looking at multiple versions of lmdeploy to determine which version is compatible with other requirements. This could take a while.
ERROR: Could not find a version that satisfies the requirement triton<=2.2.0,>=2.1.0 (from lmdeploy) (from versions: none)
ERROR: No matching distribution found for triton<=2.2.0,>=2.1.0

Why is it still failing?

irexyc commented 2 months ago

triton doesn't support Windows. You can install it like this:

# step 1, install lmdeploy without deps
pip install https://github.com/InternLM/lmdeploy/releases/download/v0.4.0/lmdeploy-0.4.0+cu118-cp311-cp311-win_amd64.whl --no-deps

# step 2, install the deps (exclude triton)
# create a new file named requirements.txt
# with the following contents:
einops
fastapi
fire
mmengine-lite
numpy
peft<=0.9.0
pillow
protobuf
pydantic>2.0.0
pynvml
safetensors
sentencepiece
shortuuid
tiktoken
torch<=2.2.2,>=2.0.0
transformers
# triton>=2.1.0,<=2.2.0
uvicorn

pip install -r .\requirements.txt --extra-index-url https://download.pytorch.org/whl/cu118
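
To confirm step 2 pulled everything in, a quick check (a sketch, nothing lmdeploy-specific; the package list mirrors requirements.txt above):

from importlib.metadata import PackageNotFoundError, version

# print the installed version of each manually-listed dependency
for pkg in ['torch', 'transformers', 'peft', 'fastapi', 'uvicorn']:
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, 'MISSING')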

EdwardBuck1 commented 2 months ago

It finally worked. You're amazing!

EdwardBuck1 commented 2 months ago

0.4.0 installed successfully, but why does running lmdeploy serve api_server liuhaotian/llava-v1.6-vicuna-7b --server-port 23333 still return this?

C:\Users\marti>lmdeploy serve api_server liuhaotian/llava-v1.6-vicuna-7b --server-port 23333
Traceback (most recent call last):
  File "G:\Python-3.9.12\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "G:\Python-3.9.12\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "G:\Python-3.9.12\Scripts\lmdeploy.exe\__main__.py", line 7, in <module>
    sys.exit(run())
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\cli\entrypoint.py", line 18, in run
    args.run(args)
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\cli\serve.py", line 248, in api_server
    run_api_server(args.model_path,
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\serve\openai\api_server.py", line 994, in serve
    VariableInterface.async_engine = AsyncEngine(
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\serve\async_engine.py", line 67, in __init__
    self._build_turbomind(model_path=model_path,
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\serve\async_engine.py", line 107, in _build_turbomind
    from lmdeploy import turbomind as tm
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\__init__.py", line 24, in <module>
    from .turbomind import TurboMind  # noqa: E402
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\turbomind.py", line 26, in <module>
    from .deploy.converter import (get_model_format, supported_formats,
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\deploy\converter.py", line 16, in <module>
    from .target_model.base import OUTPUT_MODELS, TurbomindModelConfig
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\deploy\target_model\__init__.py", line 3, in <module>
    from .w4 import TurbomindW4Model  # noqa: F401
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\deploy\target_model\w4.py", line 17, in <module>
    import _turbomind as _tm  # noqa: E402
ImportError: DLL load failed while importing _turbomind: 找不到指定的模块。

I'm at my wit's end.

irexyc commented 2 months ago

There is a small bug in the code; the message that should have been printed wasn't.

Edit this file: G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\__init__.py

# this line
if os.path.exists(os.path.join(pwd, 'lib')):

# should become
if os.path.exists(os.path.join(pwd, '..', 'lib')):
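
For reference, roughly what the bootstrap in turbomind/__init__.py does on Windows, reconstructed from the messages that appear later in this thread (a paraphrase, not the actual library source):

import os

def bootstrap():
    # Python 3.8+ on Windows does not search PATH for a .pyd's dependent
    # DLLs, so the CUDA bin directory must be registered explicitly
    CUDA_PATH = os.environ.get('CUDA_PATH')
    assert CUDA_PATH is not None, 'Can not find $env:CUDA_PATH'
    dll_path = os.path.join(CUDA_PATH, 'bin')
    print(f'Add dll path {dll_path}, please note cuda version should '
          '>= 11.3 when compiled with cuda 11')
    os.add_dll_directory(dll_path)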

EdwardBuck1 commented 2 months ago

C:\Users\marti>lmdeploy serve api_server liuhaotian/llava-v1.6-vicuna-7b --server-port 23333
Traceback (most recent call last):
  File "G:\Python-3.9.12\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "G:\Python-3.9.12\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "G:\Python-3.9.12\Scripts\lmdeploy.exe\__main__.py", line 7, in <module>
    sys.exit(run())
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\cli\entrypoint.py", line 18, in run
    args.run(args)
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\cli\serve.py", line 248, in api_server
    run_api_server(args.model_path,
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\serve\openai\api_server.py", line 994, in serve
    VariableInterface.async_engine = AsyncEngine(
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\serve\async_engine.py", line 67, in __init__
    self._build_turbomind(model_path=model_path,
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\serve\async_engine.py", line 107, in _build_turbomind
    from lmdeploy import turbomind as tm
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\__init__.py", line 22, in <module>
    bootstrap()
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\__init__.py", line 15, in bootstrap
    assert CUDA_PATH is not None, 'Can not find $env:CUDA_PATH'
AssertionError: Can not find $env:CUDA_PATH

Now the error has changed to this.

irexyc commented 2 months ago

OK, this error is the expected one now.

You are missing the CUDA_PATH environment variable.

You can set it like this (you need to restart PowerShell for it to take effect). Don't copy mine verbatim: according to your earlier log you have CUDA 11.8, so set it to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8

[Screenshot: setting CUDA_PATH in the Windows environment-variables dialog]

Or you can set it manually in PowerShell as a temporary measure, then launch from that same shell:

$env:CUDA_PATH="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8"
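
A one-liner (a sketch) to confirm the variable is actually visible to the Python process after reopening the shell:

import os
print(os.environ.get('CUDA_PATH'))  # should print the v11.8 path above, not None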

EdwardBuck1 commented 2 months ago

Thanks, I get it now. I'll grab a meal and sort it out when I'm back; it should be fine. Thank you very much for the patient answers!

irexyc commented 2 months ago

That said, your card probably can't run this model: even excluding the vision part, a 7B model needs 14 GB just to load the weights, and your card only has 8 GB.
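
Back-of-the-envelope for the 14 GB figure (assuming fp16/bf16 weights at 2 bytes per parameter):

params = 7e9          # 7B parameters
bytes_per_param = 2   # fp16/bf16
print(params * bytes_per_param / 1e9, 'GB')  # 14.0 GB for the weights alone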

lvhan028 commented 2 months ago

@irexyc @lvhan028 By the way, when will lmdeploy get a chat group of its own, like xtuner has, instead of using the internlm2 one...

It has been created. The WeChat link in the README has the QR code.

irexyc commented 2 months ago

You can try this model instead; it should run: deepseek-ai/deepseek-vl-1.3b-chat

# some extra dependencies
pip install git+https://github.com/deepseek-ai/DeepSeek-VL.git --no-deps
pip install torchvision --extra-index-url https://download.pytorch.org/whl/cu118
pip install attrdict timm
lmdeploy serve api_server deepseek-ai/deepseek-vl-1.3b-chat --cache-max-entry-count 0.3
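
A note on the last flag: as I understand it, --cache-max-entry-count sets the share of GPU memory given to the k/v cache, so lowering it to 0.3 leaves more of the 8 GB card for the model weights.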

EdwardBuck1 commented 2 months ago

G:\Python-3.9.12>lmdeploy serve api_server liuhaotian/llava-v1.6-vicuna-7b --server-port 23333
Add dll path {dll_path}, please note cuda version should >= 11.3 when compiled with cuda 11
urllib3.exceptions.SSLError: EOF occurred in violation of protocol (_ssl.c:1129)

The above exception was the direct cause of the following exception:

urllib3.exceptions.ProxyError: ('Unable to connect to proxy', SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1129)')))

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "G:\Python-3.9.12\lib\site-packages\requests\adapters.py", line 486, in send
    resp = conn.urlopen(
  File "G:\Python-3.9.12\lib\site-packages\urllib3\connectionpool.py", line 847, in urlopen
    retries = retries.increment(
  File "G:\Python-3.9.12\lib\site-packages\urllib3\util\retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /liuhaotian/llava-v1.6-vicuna-7b/resolve/main/config.json (Caused by ProxyError('Unable to connect to proxy', SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1129)'))))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "G:\Python-3.9.12\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "G:\Python-3.9.12\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "G:\Python-3.9.12\Scripts\lmdeploy.exe\__main__.py", line 7, in <module>
    sys.exit(run())
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\cli\entrypoint.py", line 18, in run
    args.run(args)
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\cli\serve.py", line 248, in api_server
    run_api_server(args.model_path,
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\serve\openai\api_server.py", line 994, in serve
    VariableInterface.async_engine = AsyncEngine(
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\serve\async_engine.py", line 67, in __init__
    self._build_turbomind(model_path=model_path,
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\serve\async_engine.py", line 108, in _build_turbomind
    self.engine = tm.TurboMind.from_pretrained(
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\turbomind.py", line 426, in from_pretrained
    model_source = get_model_source(pretrained_model_name_or_path)
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\utils.py", line 63, in get_model_source
    config = get_hf_config_content(pretrained_model_name_or_path, **kwargs)
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\utils.py", line 50, in get_hf_config_content
    config_path = get_hf_config_path(pretrained_model_name_or_path, **kwargs)
  File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\utils.py", line 43, in get_hf_config_path
    config_path = hf_hub_download(pretrained_model_name_or_path,
  File "G:\Python-3.9.12\lib\site-packages\huggingface_hub\utils\_validators.py", line 119, in _inner_fn
    return fn(*args, **kwargs)
  File "G:\Python-3.9.12\lib\site-packages\huggingface_hub\file_download.py", line 1261, in hf_hub_download
    metadata = get_hf_file_metadata(
  File "G:\Python-3.9.12\lib\site-packages\huggingface_hub\utils\_validators.py", line 119, in _inner_fn
    return fn(*args, **kwargs)
  File "G:\Python-3.9.12\lib\site-packages\huggingface_hub\file_download.py", line 1674, in get_hf_file_metadata
    r = _request_wrapper(
  File "G:\Python-3.9.12\lib\site-packages\huggingface_hub\file_download.py", line 369, in _request_wrapper
    response = _request_wrapper(
  File "G:\Python-3.9.12\lib\site-packages\huggingface_hub\file_download.py", line 392, in _request_wrapper
    response = get_session().request(method=method, url=url, **params)
  File "G:\Python-3.9.12\lib\site-packages\requests\sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "G:\Python-3.9.12\lib\site-packages\requests\sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "G:\Python-3.9.12\lib\site-packages\huggingface_hub\utils\_http.py", line 68, in send
    return super().send(request, *args, **kwargs)
  File "G:\Python-3.9.12\lib\site-packages\requests\adapters.py", line 513, in send
    raise ProxyError(e, request=request)
requests.exceptions.ProxyError: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /liuhaotian/llava-v1.6-vicuna-7b/resolve/main/config.json (Caused by ProxyError('Unable to connect to proxy', SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1129)'))))"), '(Request ID: 9bcb5713-8d53-47d0-995e-f019c4660b04)')

This should be the last problem, please help...

lvhan028 commented 2 months ago

urllib3.exceptions.ProxyError: ('Unable to connect to proxy', SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1129)')))

It looks like downloading the model failed. Try downloading the model manually, then pass its local path to lmdeploy serve api_server.
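
For example, with huggingface_hub (a sketch; snapshot_download returns the local directory of the downloaded repo, and you may need a working proxy or mirror endpoint):

from huggingface_hub import snapshot_download

# fetch the whole model repo and print where it landed
local_path = snapshot_download('liuhaotian/llava-v1.6-vicuna-7b')
print(local_path)  # then: lmdeploy serve api_server <local_path> --server-port 23333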

BUJIDAOVS commented 2 months ago

Is Windows support really this poor? From installation through inference, quantization, and deployment, the errors haven't stopped. Is there any example of getting this working end to end on Windows?

lvhan028 commented 2 months ago

@irexyc Could you provide a best-practice guide for the Windows platform?