Open edlee123 opened 3 weeks ago
Feature request
It would be great to add support for Universal Assisted Generation (UAG):
https://huggingface.co/blog/universal_assisted_generation
One-liner announcement: https://www.linkedin.com/posts/korat_transformers-now-supports-speculative-activity-7257430488041091073-qY2w?utm_source=share&utm_medium=member_desktop
Requires transformers 4.46 or later.
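For reference, here is a minimal sketch of what the call looks like on the transformers side, assuming transformers 4.46 or later. The checkpoint names are the ones I recall from the blog post's example and are only placeholders; `uag_generate_kwargs` and `run_demo` are hypothetical helper names made up for this sketch, not part of any library. How this would be wired into this project is still an open question.

```python
def uag_generate_kwargs(assistant_model, tokenizer, assistant_tokenizer,
                        max_new_tokens=64):
    """Collect the extra generate() arguments used for assisted generation
    when the target and assistant models have different tokenizers.

    Hypothetical helper for illustration only.
    """
    return {
        "assistant_model": assistant_model,
        "tokenizer": tokenizer,
        "assistant_tokenizer": assistant_tokenizer,
        "max_new_tokens": max_new_tokens,
    }


def run_demo(target_checkpoint="google/gemma-2-9b",
             assistant_checkpoint="double7/vicuna-68m",
             prompt="Alice and Bob"):
    """Generate text with a large target model assisted by a small model.

    Checkpoint names are assumptions taken from the blog post's example.
    """
    # Imported here so the helper above stays usable without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(target_checkpoint)
    assistant_tokenizer = AutoTokenizer.from_pretrained(assistant_checkpoint)
    model = AutoModelForCausalLM.from_pretrained(target_checkpoint)
    assistant_model = AutoModelForCausalLM.from_pretrained(assistant_checkpoint)

    inputs = tokenizer(prompt, return_tensors="pt")
    # Passing both tokenizers lets generate() translate draft tokens
    # between the assistant's vocabulary and the target's vocabulary.
    outputs = model.generate(
        **inputs,
        **uag_generate_kwargs(assistant_model, tokenizer, assistant_tokenizer),
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Calling `run_demo()` downloads both checkpoints, so it needs network access and enough memory for the target model. Swapping in smaller checkpoints should work the same way as long as both are causal language models.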
Motivation
The blog post reports a 1.5-2x inference speedup when decoding with a small assistant model.
Your contribution
I can help try this out if someone can point me to where to start.