pentium3/sys_reading
system paper reading notes
Splitwise: Efficient Generative LLM Inference Using Phase Splitting
#341 (Open)
pentium3 opened 7 months ago
pentium3 commented 7 months ago
https://arxiv.org/pdf/2311.18677.pdf