vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

Cupy Import errors in Docker #3184

Open jojac47 opened 7 months ago

jojac47 commented 7 months ago

I'm using a Docker image with the nvidia/cuda 12.1 container as a base. This worked perfectly for vllm until the switch to using cupy. The cupy import breaks vllm whenever you use tensor-parallel > 1. I've double-checked, and the CUDA version (12.1) and the cupy package (cupy-cuda12x) should be compatible. Any advice or guidance on this issue?
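For anyone trying to narrow this down, a minimal diagnostic sketch that can be run inside the container is below. It is not part of vllm; the helper name `diagnose_cupy` is hypothetical. It only checks whether cupy imports and, if so, which CUDA runtime version cupy reports, so the actual exception text can be attached to the issue.

```python
import importlib
import os


def diagnose_cupy():
    """Collect basic facts about the cupy install without raising.

    Hypothetical helper for illustration; not part of vllm or cupy.
    """
    report = {
        "cupy_importable": False,
        "cupy_version": None,
        "cuda_runtime": None,
        "error": None,
        # Often relevant when CUDA shared libraries are not found in Docker.
        "ld_library_path": os.environ.get("LD_LIBRARY_PATH", ""),
    }
    try:
        cupy = importlib.import_module("cupy")
    except Exception as exc:  # ImportError, or a failure loading CUDA libraries
        report["error"] = f"{type(exc).__name__}: {exc}"
        return report

    report["cupy_importable"] = True
    report["cupy_version"] = cupy.__version__
    try:
        # Returned as an integer, e.g. 12010 for CUDA 12.1.
        report["cuda_runtime"] = cupy.cuda.runtime.runtimeGetVersion()
    except Exception as exc:
        report["error"] = f"{type(exc).__name__}: {exc}"
    return report


if __name__ == "__main__":
    for key, value in diagnose_cupy().items():
        print(f"{key}: {value}")
```

If `cupy_importable` comes back `False`, the `error` field holds the exception message, which is usually the most useful thing to post here.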

rafvasq commented 2 months ago

May be relevant: https://github.com/vllm-project/vllm/pull/3625