hao-ai-lab / LookaheadDecoding

Apache License 2.0

Lookahead Decoding Development Roadmap #13

Open Viol2000 opened 8 months ago

Viol2000 commented 8 months ago

Roadmap categories:

- Software Quality
- Implementation
- New Models

qspang commented 6 months ago

Does this project support the vicuna model?

Viol2000 commented 6 months ago

Does this project support the vicuna model?

Vicuna is already supported because it is based on LlamaForCausalLM.
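The point above (Vicuna reuses the LlamaForCausalLM architecture, so any Llama-compatible patch covers it) can be sketched as a support check keyed on the architecture name a checkpoint declares in its config. The `SUPPORTED_ARCHITECTURES` set and the inlined configs below are illustrative assumptions for demonstration, not LookaheadDecoding's actual registry:

```python
# Illustrative sketch: lookahead support keyed by architecture class name.
# SUPPORTED_ARCHITECTURES and model_configs are assumptions for demonstration,
# not the repo's actual code.
SUPPORTED_ARCHITECTURES = {"LlamaForCausalLM"}

# Vicuna checkpoints declare the Llama architecture in their config.json,
# which is why they work without any extra integration.
model_configs = {
    "lmsys/vicuna-7b-v1.3": {"architectures": ["LlamaForCausalLM"]},
    "meta-llama/Llama-2-7b-hf": {"architectures": ["LlamaForCausalLM"]},
}

def is_supported(model_id: str) -> bool:
    """Return True if any declared architecture is Llama-compatible."""
    archs = model_configs[model_id]["architectures"]
    return any(a in SUPPORTED_ARCHITECTURES for a in archs)

print(is_supported("lmsys/vicuna-7b-v1.3"))  # True
```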

qspang commented 6 months ago

Thank you for your reply! Do you mean that I can use the following code to observe the acceleration effect of the vicuna model?

USE_LADE=1 python applications/chatbot.py --model_path meta-llama/vicuna-7b-v.13 --debug # no chat, with lookahead
USE_LADE=0 python applications/chatbot.py --model_path meta-llama/vicuna-7b-v.13 --debug # no chat, without lookahead

Viol2000 commented 6 months ago

> Thank you for your reply! Do you mean that I can use the following code to observe the acceleration effect of the vicuna model?
>
> USE_LADE=1 python applications/chatbot.py --model_path meta-llama/vicuna-7b-v.13 --debug # no chat, with lookahead
> USE_LADE=0 python applications/chatbot.py --model_path meta-llama/vicuna-7b-v.13 --debug # no chat, without lookahead

It should be lmsys/vicuna-7b-v1.3 and yes.

qspang commented 6 months ago

Got it! Thank you for your reply again!

LiweiPE commented 4 months ago

Hi, I'm really interested in this decoding work. Is there any progress on integrating the Qwen model? Thanks.