mit-han-lab / streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
https://arxiv.org/abs/2309.17453
MIT License

Can this support CodeLlama-34B? #35

Closed (willshion closed this issue 9 months ago)

Guangxuan-Xiao commented 9 months ago

CodeLlama models are also Llama models, so they are already supported :). See the model config: https://huggingface.co/codellama/CodeLlama-34b-Instruct-hf/blob/main/config.json
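
For anyone who wants to check this for another checkpoint, here is a minimal sketch of how one might confirm from a `config.json` that a model is in the Llama family. The helper name `is_llama_family` is hypothetical and not part of this repo, and the config dict below is an abbreviated excerpt (only the `model_type` and `architectures` fields), not the full file. How the repo itself detects the model family may differ, so treat this as an illustration only.

```python
import json


def is_llama_family(config: dict) -> bool:
    """Return True if a Hugging Face config dict looks like a Llama-architecture model.

    Hypothetical helper for illustration; it only inspects the
    `model_type` and `architectures` fields of config.json.
    """
    model_type = str(config.get("model_type", "")).lower()
    architectures = config.get("architectures") or []
    return "llama" in model_type or any(
        str(name).startswith("Llama") for name in architectures
    )


# Abbreviated excerpt of a CodeLlama config.json (other fields omitted).
codellama_config = json.loads(
    '{"architectures": ["LlamaForCausalLM"], "model_type": "llama"}'
)

# A non-Llama config for contrast.
gpt2_config = {"architectures": ["GPT2LMHeadModel"], "model_type": "gpt2"}

print(is_llama_family(codellama_config))  # True
print(is_llama_family(gpt2_config))       # False
```

To check a local download, load the file with `json.load(open("config.json"))` and pass the resulting dict to the helper.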