bd-iaas-us / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Feature]: Implementation of QLoRA on vLLM #9

Closed: chenqianfzh closed this issue 3 weeks ago

chenqianfzh commented 2 months ago

🚀 The feature, motivation and pitch

https://bytedance.larkoffice.com/wiki/ZKCQwGz7DiQbxtkh2lGc7A2hnVh

Alternatives

No response

Additional context

No response

chenqianfzh commented 3 weeks ago

https://github.com/vllm-project/vllm/pull/4776 has been merged, which resolves this feature request.
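For context, the core numeric idea behind QLoRA is to keep the frozen base weights in 4-bit NF4 (a code book of 16 levels denser near zero, matched to normally distributed weights, quantized blockwise with an absmax scale per block) and apply a low-rank LoRA update in higher precision. The sketch below is illustrative only, not vLLM's or bitsandbytes' implementation; the helper names (`nf4_quantize`, `nf4_dequantize`, `qlora_forward`) are hypothetical, and the code-book constants are reproduced approximately from the QLoRA reference.

```python
import numpy as np

# Approximate NF4 code book: 16 levels in [-1, 1], denser near zero
# (values as published with the QLoRA reference implementation).
NF4_LEVELS = np.array([
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
])

def nf4_quantize(w, block_size=64):
    """Blockwise absmax quantization of a weight matrix to 4-bit NF4 indices."""
    flat = w.reshape(-1, block_size)
    absmax = np.abs(flat).max(axis=1, keepdims=True)  # one fp scale per block
    normed = flat / absmax                            # values now in [-1, 1]
    # Round each value to the nearest NF4 level; store only the 4-bit index.
    idx = np.abs(normed[..., None] - NF4_LEVELS).argmin(axis=-1)
    return idx.astype(np.uint8), absmax

def nf4_dequantize(idx, absmax, shape):
    """Recover approximate weights: look up each level, rescale per block."""
    return (NF4_LEVELS[idx] * absmax).reshape(shape)

def qlora_forward(x, w_q, lora_a, lora_b, scale=1.0):
    """QLoRA layer: x @ dequant(W) plus a low-rank LoRA update x @ A @ B."""
    idx, absmax, shape = w_q
    w_hat = nf4_dequantize(idx, absmax, shape)
    return x @ w_hat + scale * (x @ lora_a) @ lora_b

# Demo: quantize a random weight matrix and check reconstruction quality.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
idx, absmax = nf4_quantize(w)
w_hat = nf4_dequantize(idx, absmax, w.shape)
```

In vLLM terms, the merged PR lets the engine load the base model in this 4-bit form via bitsandbytes while LoRA adapters stay in higher precision, which is what makes serving QLoRA-style checkpoints memory-efficient.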