Infini-AI-Lab / TriForce

[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
https://infini-ai-lab.github.io/TriForce/

Adapt to open source inference framework #11

Open: Siegfried-qgf opened this issue 2 months ago

Siegfried-qgf commented 2 months ago

Have you considered integrating this work into an open-source inference framework, such as vLLM?

preminstrel commented 2 months ago

Yeah, this is a good idea!