Open: nivibilla opened 2 months ago
Hi, I'm one of the authors of this paper. Thank you for your interest in our work! We plan to release the code soon, hopefully in a few weeks.
@ramyaprabhu-alt Just curious: will the code release be a separate project or a PR against vLLM? I assume it's a PR, right?
Our initial release will be as a separate project on a slightly older version of vLLM. But soon after, we can also raise a PR against vLLM-latest.
Glad to share the source code of vAttention. Please check it out here: https://github.com/microsoft/vattention
🚀 The feature, motivation and pitch
The paper claims major improvements over vLLM. Unfortunately, no code has been released yet, only the paper.
arxiv.org/abs/2405.04437
Alternatives
No response
Additional context
No response