vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

NCCL cannot be captured in a graph #3069

Closed zhouzj1610 closed 5 months ago

zhouzj1610 commented 8 months ago

Running a 7B model on a single GPU works fine, but with two GPUs I get an NCCL error. vLLM version: v0.3.2.

command: python -m vllm.entrypoints.openai.api_server --model models/Qwen-14B-Chat --trust-remote-code --port 8509 --served-model-name qwen-14b-chat --dtype float16 --tensor-parallel-size 2

NCCL: 2.19.3+cuda11.0, CUDA: 11.6

error: NCCL WARN NCCL cannot be captured in a graph if either it wasn't built with CUDA runtime >= 11.3 or if the installed CUDA driver < R465.
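The warning above encodes two conditions: graph capture is refused when NCCL was built against a CUDA runtime older than 11.3, or when the installed driver is below R465. A minimal sketch of that check in Python (the function name and the example version tuples are illustrative, not NCCL's actual code):

```python
# Hypothetical helper mirroring the conditions in the NCCL warning:
# graph capture needs NCCL built with CUDA runtime >= 11.3 AND driver >= R465.
def supports_graph_capture(nccl_built_cuda: tuple, driver_version: int) -> bool:
    return nccl_built_cuda >= (11, 3) and driver_version >= 465

# NCCL 2.19.3+cuda11.0 (as in the report): built against CUDA 11.0
print(supports_graph_capture((11, 0), 520))  # False -> capture refused
# NCCL 2.18.6+cuda11.8: built against CUDA 11.8
print(supports_graph_capture((11, 8), 520))  # True -> capture allowed
```

This is why the error appears even though the host CUDA toolkit is 11.6: what matters is the runtime the NCCL wheel itself was built against.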

I checked the NCCL releases listed on NVIDIA's website; only three are listed: (screenshot attached)

zhouzj1610 commented 8 months ago

Finally solved it.

Cause: installing with `pip install xformers --index-url https://download.pytorch.org/whl/cu118` pulls in xformers-0.0.24+cu118, torch 2.2.0+cu118, and NCCL 2.19.3+cuda11.0 by default; that NCCL build is incompatible with my CUDA 11.6.

Fix: rebuild the virtual environment and install with `pip install xformers==0.0.23 --index-url https://download.pytorch.org/whl/cu118`, which installs xformers together with torch and automatically pulls in NCCL 2.18.6+cuda11.8, which is compatible with CUDA 11.6. Multi-GPU serving now works.
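Before relaunching the server, it can help to confirm which versions pip actually resolved. A small standard-library sketch (the helper name is mine, not part of vLLM):

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(pkg: str) -> str:
    """Return the installed version of `pkg`, or 'not installed'."""
    try:
        return version(pkg)
    except PackageNotFoundError:
        return "not installed"

# Packages whose resolved versions determine which NCCL build you get
# (names taken from the fix above).
for pkg in ("torch", "xformers"):
    print(pkg, installed_version(pkg))
```

If torch reports 2.2.0 here while the fix above expects the 0.0.23 xformers pin, the environment still has the incompatible NCCL and should be rebuilt.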