Closed: ufownl closed this 2 weeks ago
When the KV cache size calculation was refactored, the wrong functor ended up being used to compute the size of the KV cache. The buffer is therefore allocated with the wrong size, which leads to a KV cache buffer overflow.
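To illustrate this class of bug, here is a minimal sketch of how passing the wrong sizing functor to an allocation helper can under-allocate a KV cache. All names here (`Config`, `PerPositionSize`, `FullCacheSize`, `AllocateKVCache`, `Fits`) and the size formulas are hypothetical and are not taken from the actual code base; the real functors and layout may differ.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model dimensions, for illustration only.
struct Config {
  std::size_t layers;
  std::size_t kv_heads;
  std::size_t head_dim;
  std::size_t seq_len;
};

// Hypothetical functor: elements needed for K and V of ONE position in ONE layer.
struct PerPositionSize {
  std::size_t operator()(const Config& c) const {
    return 2 * c.kv_heads * c.head_dim;
  }
};

// Hypothetical functor: elements needed for the whole cache
// (every position of every layer).
struct FullCacheSize {
  std::size_t operator()(const Config& c) const {
    return c.seq_len * c.layers * PerPositionSize{}(c);
  }
};

// Allocates the cache using whichever sizing functor the caller passes in.
// Both functors have the same call signature, so passing the wrong one
// compiles cleanly; nothing flags the mistake at build time.
template <typename SizeFn>
std::vector<float> AllocateKVCache(const Config& c, SizeFn size_fn) {
  return std::vector<float>(size_fn(c));
}

// Returns true if the buffer can hold everything the (assumed) writer
// would store, i.e. the full cache for all layers and positions.
bool Fits(const std::vector<float>& cache, const Config& c) {
  return cache.size() >= FullCacheSize{}(c);
}
```

With this sketch, `AllocateKVCache(c, PerPositionSize{})` returns a buffer for which `Fits` is false, so any code that writes the full cache into it would run past the end, while `AllocateKVCache(c, FullCacheSize{})` passes the check. A guard like `Fits` at allocation time is one way such a mismatch could be caught early.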