bd-iaas-us / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Feature]: TP support in QLoRA of VLLM #12

Open chenqianfzh opened 3 weeks ago

chenqianfzh commented 3 weeks ago

🚀 The feature, motivation and pitch

Support tensor parallelism (TP) when serving QLoRA (bitsandbytes-quantized) models in vLLM, so that quantized models with LoRA adapters can be sharded across multiple GPUs.
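For illustration, a sketch of how the requested combination might be invoked once supported. The individual pieces (bitsandbytes quantization, LoRA adapters, tensor parallelism) each have existing vLLM server flags; the feature request is for them to work together. Model and adapter paths below are placeholders, not part of the original request.

```shell
# Hypothetical invocation combining QLoRA (bitsandbytes) with TP.
# Assumes the flags compose once this feature lands; paths are placeholders.
python -m vllm.entrypoints.openai.api_server \
    --model huggyllama/llama-7b \
    --quantization bitsandbytes \
    --load-format bitsandbytes \
    --enable-lora \
    --lora-modules my-adapter=/path/to/qlora-adapter \
    --tensor-parallel-size 2
```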

Alternatives

No response

Additional context

No response

chenqianfzh commented 1 week ago

PR: https://github.com/vllm-project/vllm/pull/5813