InternLM / lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
https://lmdeploy.readthedocs.io/en/latest/
Apache License 2.0

[Feature] blazing great work about KV Cache: Mooncake #1884

Open zhyncs opened 3 months ago

zhyncs commented 3 months ago

Motivation

TLDR: This system has undergone large-scale deployment and validation at Kimi, so it has great reference value.

repo: https://github.com/kvcache-ai/Mooncake

tech report: https://github.com/kvcache-ai/Mooncake/blob/main/Mooncake-v1.pdf

zhihu: https://zhuanlan.zhihu.com/p/705754254 https://zhuanlan.zhihu.com/p/705910725

cc @lzhangzz @grimoire @lvhan028

Related resources

No response

Additional context

No response

zhyncs commented 3 months ago

ref: https://zhuanlan.zhihu.com/p/706097807

A detailed explanatory article by @feifeibear

lvhan028 commented 3 months ago

@irexyc Let's conduct a deep dive into this great work

zhyncs commented 3 months ago

Microsoft released a blog post about SplitWise in January 2024, which is similar to Mooncake: https://www.microsoft.com/en-us/research/blog/splitwise-improves-gpu-usage-by-splitting-llm-inference-phases/
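The core idea the two systems share — running the compute-bound prefill phase and the memory-bound decode phase on separate workers and shipping the KV cache between them — can be sketched roughly as below. All class and function names here are illustrative, not the actual API of Mooncake, SplitWise, or LMDeploy; in real deployments the handoff crosses GPUs or nodes over RDMA/NVLink rather than an in-process object.

```python
from dataclasses import dataclass, field

@dataclass
class KVCache:
    """Per-request key/value entries produced during prefill."""
    request_id: str
    num_prompt_tokens: int
    blocks: list = field(default_factory=list)

class PrefillWorker:
    """Runs the compute-bound prefill phase and emits a KVCache."""
    def prefill(self, request_id: str, prompt_tokens: list) -> KVCache:
        # Stand-in for one forward pass over the whole prompt; one
        # "block" per token plays the role of the attention K/V entries.
        blocks = [f"kv[{t}]" for t in prompt_tokens]
        return KVCache(request_id, len(prompt_tokens), blocks)

class DecodeWorker:
    """Runs the memory-bound decode phase against a transferred KVCache."""
    def decode(self, cache: KVCache, max_new_tokens: int) -> list:
        out = []
        for i in range(max_new_tokens):
            # Each decode step appends one new KV entry and emits a token.
            cache.blocks.append(f"kv[gen{i}]")
            out.append(f"tok{i}")
        return out

def serve(prompt_tokens, max_new_tokens=4):
    """Disaggregated pipeline: prefill on one worker, then hand the
    KV cache to a separate decode worker (the transfer step elided)."""
    cache = PrefillWorker().prefill("req-0", prompt_tokens)
    return DecodeWorker().decode(cache, max_new_tokens), cache

tokens, cache = serve(["a", "b", "c"])
print(tokens)              # 4 generated tokens
print(len(cache.blocks))   # 3 prompt KV blocks + 4 decode KV blocks
```

The payoff of the split is that each pool can be sized and batched for its own bottleneck: prefill workers for throughput on long prompts, decode workers for latency and KV-cache capacity.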