Qwen2.5-7B-Instruct launches normally on multiple GPUs, but inference fails

joerunfu commented 1 month ago

System Info

CUDA==12.4, Ubuntu 24.0, dify==0.8.2 (deployed with docker-compose), GPU: 6 x NVIDIA A10 24G

Running Xinference with Docker?

[X] docker
[ ] pip install
[ ] installation from source

Version info

0.15.2

The command used to start Xinference

docker stop xinference
docker rm xinference
docker run -d --name xinference -p 9002:9002 \
  --restart=always \
  --log-driver json-file \
  --log-opt max-size=100m \
  --log-opt max-file=2 \
  --gpus all \
  -e XINFERENCE_MODEL_SRC=modelscope \
  -e XINFERENCE_HOME=/workspace \
  -v /data/xinference:/workspace \
  -v /ai/model:/model \
  -v /ai/embedding-model:/embedding-model \
  -v /ai/rerank-model:/rerank-model \
  -v /ai/image-model:/image-model \
  -v /ai/audio-model:/audio-model \
  -v /etc/localtime:/etc/localtime:ro \
  xprobe/xinference:latest \
  xinference-local -H 0.0.0.0 -p 9002

Reproduction

After the model starts, I click to launch the web UI and ask a question. Inference then fails with the following message:

[address=0.0.0.0:38417, pid=1399] CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Launching the model on a single GPU works without any problem. What could be causing this?
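The error text itself suggests passing CUDA_LAUNCH_BLOCKING=1 so that the stack trace points at the kernel that actually failed. A minimal sketch of how that variable might be passed to the container, assuming the same image and port as the command above: the container name `xinference-debug`, the `DOCKER` override variable, and the function name `start_debug_container` are hypothetical, and most volume mounts are trimmed for brevity. Whether the model worker process picks the variable up from the container environment is also an assumption.

```shell
# Dry run by default: prints the command instead of executing it.
# Set DOCKER=docker in the environment to really start the container.
DOCKER="${DOCKER:-echo docker}"

# Hypothetical helper: relaunch Xinference with synchronous CUDA kernel
# launches, as the error message recommends for debugging.
# The original container must be stopped first because the port is reused.
start_debug_container() {
  $DOCKER run -d --name xinference-debug -p 9002:9002 \
    --gpus all \
    -e CUDA_LAUNCH_BLOCKING=1 \
    -e XINFERENCE_MODEL_SRC=modelscope \
    -e XINFERENCE_HOME=/workspace \
    -v /data/xinference:/workspace \
    xprobe/xinference:latest \
    xinference-local -H 0.0.0.0 -p 9002
}

start_debug_container
```

Once the container is really started, launching the model on multiple GPUs again and following the output with `docker logs -f xinference-debug` should show a trace that is closer to the failing call. Synchronous launches slow inference down, so this is only meant for debugging.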

Expected behavior

Please help fix this.

github-actions[bot] commented 1 month ago

This issue is stale because it has been open for 7 days with no activity.

Royhuiy commented 1 month ago

Has this issue been resolved?

tungsten106 commented 1 month ago

I'm hitting the same problem here. I started two models (qwen2.5-7b-instruct and qwen2.5-coder-7b-instruct) on an A100 (40G), and calling v1/chat/completion for inference fails with:

[address=0.0.0.0:17347, pid=88771] CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

cyhasuka commented 1 month ago

Same issue.

github-actions[bot] commented 3 weeks ago

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] commented 2 weeks ago

This issue was closed because it has been inactive for 5 days since being marked as stale.

xorbitsai / inference