pentium3 / sys_reading

system paper reading notes

Splitwise: Efficient Generative LLM Inference Using Phase Splitting #341

Open: pentium3 opened this issue 7 months ago

pentium3 commented 7 months ago

https://arxiv.org/pdf/2311.18677.pdf