mit-han-lab / streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
https://arxiv.org/abs/2309.17453
MIT License

[Feature Request] Support InternLM Model #58

Open · vansin opened this issue 8 months ago

vansin commented 8 months ago

https://huggingface.co/internlm
https://github.com/InternLM/InternLM

thiner commented 6 months ago

I think you should ask the model makers or inference framework authors to support streaming-llm instead.
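
For readers landing here: in this repo, streaming support for a model family generally means patching its attention to use cache-relative rotary positions and wrapping generation with a start+recent KV cache (see `streaming_llm/enable_streaming_llm.py`). Below is a minimal sketch of what an InternLM path could look like, modeled on the existing Llama path; `enable_internlm_pos_shift_attention` is hypothetical, and the tensor-layout assumption is borrowed from Llama, not verified against InternLM.

```python
# Sketch of an InternLM entry point, mirroring the Llama path in the repo's
# streaming_llm/enable_streaming_llm.py. enable_internlm_pos_shift_attention
# is hypothetical: InternLM's attention would need a rotary-position-shift
# patch analogous to the repo's enable_llama_pos_shift_attention.

from streaming_llm.kv_cache import StartRecentKVCache

def enable_streaming_internlm(model, start_size=4, recent_size=512):
    # Assumption: InternLM's decoder is Llama-like, so K/V tensors are
    # [batch, num_heads, seq_len, head_dim] and the sequence dim is 2.
    k_seq_dim = v_seq_dim = 2

    # Hypothetical patch: assign rotary positions by slot within the cache
    # rather than by absolute token index, so evicting middle tokens does
    # not push positions outside the trained range.
    # enable_internlm_pos_shift_attention(model)

    return StartRecentKVCache(
        start_size=start_size,    # attention-sink tokens pinned at the front
        recent_size=recent_size,  # sliding window of most recent tokens
        k_seq_dim=k_seq_dim,
        v_seq_dim=v_seq_dim,
    )
```

During generation, the returned cache would be used to evict key/value states between decoding steps, the same way the repo's example script drives the Llama cache.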