mit-han-lab / streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
https://arxiv.org/abs/2309.17453
MIT License

[Feature Request] Support InternLM Model #58

Open · vansin opened this issue 8 months ago

vansin commented 8 months ago

https://huggingface.co/internlm
https://github.com/InternLM/InternLM

thiner commented 6 months ago

I think you should ask the model makers or inference framework authors to support streaming-llm instead.
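
For readers landing here: in this repo, streaming support for a model family generally means patching its attention to use cache-relative rotary positions and wrapping generation with a start+recent KV cache (see `streaming_llm/enable_streaming_llm.py`). Below is a minimal sketch of what an InternLM path could look like, modeled on the existing Llama path; `enable_internlm_pos_shift_attention` is hypothetical, and the tensor-layout assumption is borrowed from Llama, not verified against InternLM.

```python
# Sketch of an InternLM entry point, mirroring the Llama path in the repo's
# streaming_llm/enable_streaming_llm.py. enable_internlm_pos_shift_attention
# is hypothetical: InternLM's attention would need a rotary-position-shift
# patch analogous to the repo's enable_llama_pos_shift_attention.

from streaming_llm.kv_cache import StartRecentKVCache

def enable_streaming_internlm(model, start_size=4, recent_size=512):
    # Assumption: InternLM's decoder is Llama-like, so K/V tensors are
    # [batch, num_heads, seq_len, head_dim] and the sequence dim is 2.
    k_seq_dim = v_seq_dim = 2

    # Hypothetical patch: assign rotary positions by slot within the cache
    # rather than by absolute token index, so evicting middle tokens does
    # not push positions outside the trained range.
    # enable_internlm_pos_shift_attention(model)

    return StartRecentKVCache(
        start_size=start_size,    # attention-sink tokens pinned at the front
        recent_size=recent_size,  # sliding window of most recent tokens
        k_seq_dim=k_seq_dim,
        v_seq_dim=v_seq_dim,
    )
```

During generation, the returned cache would be used to evict key/value states between decoding steps, the same way the repo's example script drives the Llama cache.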