pentium3/sys_reading
system paper reading notes
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache
#316
Open
pentium3 opened 10 months ago
pentium3 commented 10 months ago
https://arxiv.org/pdf/2401.02669.pdf