opengear-project / GEAR

GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM
MIT License
116 stars, 7 forks