hao-ai-lab / LookaheadDecoding

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
https://arxiv.org/abs/2402.02057
Apache License 2.0

Can lade accelerate T5? #21

Closed · yjdy closed this issue 11 months ago

yjdy commented 12 months ago

Thanks for your work. Did you test lade on other generative models, like T5 or BART? Can it accelerate them?

Viol2000 commented 12 months ago

Hi, we do not currently support T5/BART models. In principle, encoder-decoder models can be supported and should also see a speedup, but that would require additional engineering effort.
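For context, a minimal sketch of how lade is enabled for the models it does support (decoder-only Hugging Face models, per the repository README); the `augment_all` / `config_lade` calls and their parameters are taken from the README, and the guard below simply degrades gracefully when the package is not installed:

```python
# Hedged sketch: enabling Lookahead Decoding for supported decoder-only models.
# Encoder-decoder models such as T5/BART are not yet supported, so lade would
# leave their generate() path unchanged.
try:
    import lade

    lade.augment_all()  # patch supported Hugging Face decoder-only models
    # Parameter values below mirror the README example; tune per model/GPU.
    lade.config_lade(LEVEL=5, WINDOW_SIZE=7, GUESS_SET_SIZE=7, DEBUG=0)
    LADE_ENABLED = True
except ImportError:
    # lade not installed; fall back to vanilla autoregressive decoding.
    LADE_ENABLED = False

print("lookahead decoding enabled:", LADE_ENABLED)
```

After this setup, the usual `model.generate(...)` call is accelerated transparently for supported architectures; nothing in the generation code itself changes.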