pentium3 / sys_reading

system paper reading notes

Accelerating Retrieval-Augmented Language Model Serving with Speculation #373

Open: pentium3 opened this issue 3 months ago

pentium3 commented 3 months ago

https://arxiv.org/abs/2401.14021