li-plus / chatglm.cpp

C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4
MIT License
2.84k stars 327 forks source link

Allocate kv-cache on demand #277

Closed li-plus closed 4 months ago