huggingface / tgi-gaudi

Large Language Model Text Generation Inference on Habana Gaudi
http://hf.co/docs/text-generation-inference
Apache License 2.0

Adding Universal Assisted Generation #244

Open edlee123 opened 3 weeks ago

edlee123 commented 3 weeks ago

Feature request

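It would be great to add Universal Assisted Generation (UAG) support, i.e. assisted (speculative) decoding where the assistant model can use a different tokenizer from the target model: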

https://huggingface.co/blog/universal_assisted_generation

One-line summary: https://www.linkedin.com/posts/korat_transformers-now-supports-speculative-activity-7257430488041091073-qY2w?utm_source=share&utm_medium=member_desktop

Requires transformers 4.46 or later.
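
For reference, here is a rough sketch of what the feature looks like in plain transformers, based on the blog post above. It is not how tgi-gaudi would wire it in, just the upstream call a port would need to reproduce. The helper names `supports_uag` and `generate_with_assistant` are made up for this illustration, and the checkpoint names in the usage comment are only examples.

```python
def supports_uag(version: str) -> bool:
    """Return True if a transformers version string is at least 4.46."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (4, 46)


def generate_with_assistant(prompt, target_name, assistant_name, max_new_tokens=64):
    """Assisted generation where target and assistant may use different tokenizers.

    Hypothetical helper: downloads both checkpoints, so it is only a sketch.
    """
    # Imported lazily so supports_uag() works without transformers installed.
    import transformers
    from transformers import AutoModelForCausalLM, AutoTokenizer

    if not supports_uag(transformers.__version__):
        raise RuntimeError("Universal Assisted Generation needs transformers >= 4.46")

    tokenizer = AutoTokenizer.from_pretrained(target_name)
    assistant_tokenizer = AutoTokenizer.from_pretrained(assistant_name)
    model = AutoModelForCausalLM.from_pretrained(target_name)
    assistant_model = AutoModelForCausalLM.from_pretrained(assistant_name)

    inputs = tokenizer(prompt, return_tensors="pt")
    # Passing both tokenizers is what lets generate() translate draft tokens
    # between the assistant vocabulary and the target vocabulary.
    outputs = model.generate(
        **inputs,
        assistant_model=assistant_model,
        tokenizer=tokenizer,
        assistant_tokenizer=assistant_tokenizer,
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]


# Example call (checkpoint names are illustrative only):
# generate_with_assistant("Alice and Bob", "google/gemma-2-9b", "double7/vicuna-68m")
```

If both models share a tokenizer, the two tokenizer arguments can be dropped and this reduces to regular assisted generation.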

Motivation

A 1.5-2x inference speedup from using a small assistant model, as reported in the blog post linked above.

Your contribution

I can help try this out if someone can point me to where to start.