vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai

[Feature]: Supporting a version of Consistency LLM #4701

Open · usaxena-asapp opened this issue 2 months ago

usaxena-asapp commented 2 months ago

🚀 The feature, motivation and pitch

Consistency LLMs (CLLMs, https://hao-ai-lab.github.io/blogs/cllm/) claim to speed up inference by fine-tuning a model so that Jacobi decoding converges in far fewer iterations, letting it emit several tokens per forward pass. I wonder what version of this we could support in vLLM?
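
For anyone skimming: the core of CLLM is Jacobi decoding, which treats greedy autoregressive generation as a fixed-point iteration over a block of future tokens. Below is a minimal sketch of plain Jacobi decoding with a Hugging Face causal LM; the model choice, block size, and random initialization are illustrative assumptions, not CLLM's or vLLM's actual code:

```python
# A minimal sketch of greedy Jacobi decoding (not the CLLM/vLLM implementation).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in model, for illustration only
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

@torch.no_grad()
def jacobi_decode(prompt: str, n_tokens: int = 16, max_iters: int = 16):
    prefix = tok(prompt, return_tensors="pt").input_ids
    # Start from an arbitrary guess for the next n_tokens positions.
    guess = torch.randint(0, model.config.vocab_size, (1, n_tokens))
    for step in range(max_iters):
        logits = model(torch.cat([prefix, guess], dim=1)).logits
        # One parallel forward pass refreshes every guessed slot:
        # the prediction for slot i conditions on prefix + guess[:i].
        new_guess = logits[:, prefix.shape[1] - 1 : -1, :].argmax(dim=-1)
        if torch.equal(new_guess, guess):  # fixed point reached
            break
        guess = new_guess
    return tok.decode(guess[0]), step + 1

text, iters = jacobi_decode("The capital of France is", n_tokens=8)
print(f"converged in {iters} iterations: {text!r}")
```

The fixed point matches ordinary greedy autoregressive decoding; CLLM's contribution is fine-tuning the model so this loop converges in far fewer than n_tokens iterations, which is where the claimed speedup comes from.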

Alternatives

No response

Additional context

No response

arunpatala commented 2 months ago

The paper indeed seems interesting (it reads like a mix of diffusion models and autoregressive models).
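
To unpack that analogy a bit: the CLLM training objective resembles the consistency loss used to distill diffusion models, in that the model is trained to map any intermediate state on a Jacobi trajectory directly to the trajectory's fixed point. A rough sketch of such a loss is below; the tensor names are hypothetical, and the hard-label cross-entropy is a simplification (the paper measures a distance between output distributions):

```python
# A rough, simplified sketch of a CLLM-style global consistency loss,
# assuming Jacobi trajectories (intermediate state + fixed point) were
# collected beforehand. Not the paper's reference implementation.
import torch
import torch.nn.functional as F

def global_consistency_loss(model, prefix, intermediate, fixed_point):
    """Push predictions made from an intermediate Jacobi state
    toward the trajectory's fixed point (the greedy AR output)."""
    # Logits when conditioning on the not-yet-converged guess.
    logits = model(torch.cat([prefix, intermediate], dim=1)).logits
    guess_logits = logits[:, prefix.shape[1] - 1 : -1, :]
    # Converged tokens as hard labels (a simplification, as noted above).
    return F.cross_entropy(
        guess_logits.reshape(-1, guess_logits.size(-1)),
        fixed_point.reshape(-1),
    )
```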