mit-han-lab / streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
https://arxiv.org/abs/2309.17453
MIT License
6.59k stars, 361 forks

What is the difference between window attention and sliding window with recomputation? #81

Closed: seeyourcell closed this issue 4 months ago

seeyourcell commented 4 months ago

What is the difference between window attention and sliding window with recomputation? Is there a paper I can refer to? [image]