FasterDecoding / Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
https://sites.google.com/view/medusa-llm
Apache License 2.0
2.28k stars, 154 forks

Fix for removing LM_HEAD and upgrading Medusa v2 #103

Closed by tgaddair 5 months ago