pentium3 / sys_reading

system paper reading notes

Fast Distributed Inference Serving for Large Language Models #351

Status: Open. pentium3 opened this issue 3 months ago.

pentium3 commented 3 months ago

https://arxiv.org/pdf/2305.05920.pdf

pentium3 commented 3 months ago

https://zhuanlan.zhihu.com/p/648759542