xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

Error when launching qwen1.5-7b on multiple GPUs #1091

Closed. lyn-my closed this issue 2 weeks ago.

lyn-my commented 5 months ago

When launching the qwen1.5-7b model on multiple GPUs with the command xinference launch -u qwen1.5-7b-local -n qwen1.5-7b-local -s 7 -f pytorch --n-gpu 2 --gpu_memory_utilization 0.6, the following error is raised:

(xinference) skytech@skymachine:~/llm/llm-chat/xinference$ xinference launch -u qwen1.5-7b-local -n qwen1.5-7b-local -s 7 -f pytorch --n-gpu 2 --gpu_memory_utilization 0.6

Launch model name: qwen1.5-7b-local with kwargs: {'gpu_memory_utilization': 0.6}
Traceback (most recent call last):
  File "/home/skytech/miniconda3/envs/xinference/bin/xinference", line 8, in <module>
    sys.exit(cli())
  File "/home/skytech/miniconda3/envs/xinference/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/skytech/miniconda3/envs/xinference/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/skytech/miniconda3/envs/xinference/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/skytech/miniconda3/envs/xinference/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/skytech/miniconda3/envs/xinference/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/skytech/miniconda3/envs/xinference/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/skytech/miniconda3/envs/xinference/lib/python3.11/site-packages/xinference/deploy/cmdline.py", line 642, in model_launch
    model_uid = client.launch_model(
  File "/home/skytech/miniconda3/envs/xinference/lib/python3.11/site-packages/xinference/client/restful/restful_client.py", line 837, in launch_model
    raise RuntimeError(
RuntimeError: Failed to launch model, detail: [address=0.0.0.0:36183, pid=2389448] CUDA error: invalid device function
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

error.txt
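The key line is "CUDA error: invalid device function", which usually means the installed PyTorch build does not ship kernels for the GPU's compute capability. A minimal sketch (not part of the original report, assuming a standard PyTorch install) to compare the build's supported architectures against the hardware:

import torch

# Architectures the installed torch wheel was compiled for,
# e.g. ['sm_50', 'sm_60', ..., 'sm_90'].
print(torch.cuda.get_arch_list())

# Compute capability of each physical GPU, e.g. (8, 6) for an Ampere card.
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i), torch.cuda.get_device_capability(i))

If a GPU's compute capability is missing from the arch list, any kernel launch on that card can fail with exactly this error.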

ChengjieLi28 commented 5 months ago

@lyn-my Please check whether your CUDA and torch environments are working correctly:

import torch

# Both checks should succeed: CUDA must be available and both GPUs visible.
print(torch.cuda.is_available())  # expected: True
print(torch.cuda.device_count())  # expected: 2
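If both checks pass, one further sanity check (a sketch, not part of the original reply) is to launch a trivial kernel on each GPU; on a torch build that lacks kernels for the card's architecture, this fails with the same "CUDA error: invalid device function" seen in the traceback above:

import torch

# Run a trivial op on every visible GPU; an architecture mismatch between
# the torch build and the card fails here, outside of Xinference.
for i in range(torch.cuda.device_count()):
    x = torch.ones(2, device=f"cuda:{i}")
    print(i, (x + x).sum().item())  # expected: 4.0 per device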
github-actions[bot] commented 2 weeks ago

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] commented 2 weeks ago

This issue was closed because it has been inactive for 5 days since being marked as stale.