hao-ai-lab / LookaheadDecoding

Apache License 2.0

fix llama kv cache #38

Closed: jiqing-feng closed 6 months ago

jiqing-feng commented 6 months ago

Hi @zhisbug @Viol2000

The llama model needs to catch up with HF, since HF now uses a KV cache in its llama model.

Related issue: https://github.com/hao-ai-lab/LookaheadDecoding/issues/35
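For context, HF's llama forward pass threads a per-layer key/value cache through each decoding step, so any custom decoding loop has to keep that cache in sync with the tokens it accepts. Below is a minimal pure-Python sketch of that bookkeeping (not the PR's code, and not the real tensor types); the function names and list-based "tensors" are illustrative only. The point it shows: a method like lookahead decoding can accept several tokens in one forward pass, and the cache must grow by exactly that many entries or the cache length and position ids drift apart.

```python
# Illustrative sketch of HF-style per-layer KV-cache bookkeeping.
# Lists stand in for key/value tensors; names are hypothetical.

def init_cache(num_layers):
    # one (keys, values) pair per layer, initially empty
    return [([], []) for _ in range(num_layers)]

def append_tokens(cache, new_kv_per_layer):
    # new_kv_per_layer: per-layer (k_list, v_list) for the tokens accepted
    # this step -- lookahead decoding may accept several tokens at once
    for (keys, values), (k_new, v_new) in zip(cache, new_kv_per_layer):
        keys.extend(k_new)
        values.extend(v_new)
    return cache

def cache_len(cache):
    # sequence length the model believes it has already processed
    return len(cache[0][0]) if cache else 0

cache = init_cache(num_layers=2)
# step 1: prefill with a 3-token prompt
append_tokens(cache, [(["k0", "k1", "k2"], ["v0", "v1", "v2"])] * 2)
# step 2: a lookahead step accepts 2 tokens in one forward pass
append_tokens(cache, [(["k3", "k4"], ["v3", "v4"])] * 2)
print(cache_len(cache))  # 5
```

If the decoding loop forgot to append the extra accepted tokens' k/v entries, the cache would report a shorter sequence than the model has actually emitted, which is the kind of mismatch this PR addresses.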

Viol2000 commented 6 months ago

Thanks for your efforts. I will merge the code soon.

jiqing-feng commented 6 months ago

Hi @Viol2000 . Would you please merge this PR? Thx!