FasterDecoding / Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
https://sites.google.com/view/medusa-llm
Apache License 2.0
2.28k stars, 154 forks

V1.0 prerelease #71

Closed by ctlllll 9 months ago