pentium3/sys_reading: system paper reading notes
DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving #366
Open: pentium3 opened 8 months ago
pentium3 commented 8 months ago
https://arxiv.org/pdf/2401.09670v1.pdf