vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Misc]: vLLM v0.6.0 CUDA 12 missing wheel file #8362

Open JasmondL opened 6 days ago

JasmondL commented 6 days ago


I'd like to understand why the most recent release omits the CUDA 12 wheel package.


youkaichao commented 6 days ago

The wheels on PyPI (https://pypi.org/project/vllm/) should be CUDA 12.

cc @simon-mo: has the release pipeline changed? The release page https://github.com/vllm-project/vllm/releases/tag/v0.6.0 indeed only has CUDA 11 wheels.
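For anyone trying to tell the two builds apart locally: vLLM wheels built against a non-default CUDA version typically carry a PEP 440 local version suffix (e.g. `0.6.0+cu118` for the CUDA 11.8 build on the GitHub release page), while the plain PyPI wheel has no suffix and follows the project's default CUDA build. A small sketch, assuming that versioning convention holds:

```python
def cuda_variant(version: str) -> str:
    """Return the CUDA tag encoded in a vLLM version string.

    Assumes the convention that non-default CUDA builds append a
    local version segment such as "+cu118"; a bare version means
    the default CUDA build (CUDA 12 for recent releases).
    """
    if "+" in version:
        return version.split("+", 1)[1]
    return "default"


# Illustrative version strings, not fetched from a real install:
print(cuda_variant("0.6.0+cu118"))  # CUDA 11.8 wheel from the release page
print(cuda_variant("0.6.0"))        # default (PyPI) wheel
```

You can feed it the installed version via `import vllm; vllm.__version__` to check which variant `pip` actually resolved.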