pentium3 / sys_reading

system paper reading notes
Efficient Streaming Language Models with Attention Sinks #290

Open. pentium3 opened this issue 1 year ago

pentium3 commented 1 year ago

https://github.com/mit-han-lab/streaming-llm