wang2yn84 closed 3 weeks ago
Can you add a short description of this PR? Does this PR fix the performance issue with ragged attention? I also saw some repeat kv changes in this PR.
That's because it's based on another PR. After that one is pushed, I'll rebase and it'll be clearer.
Rebased.
In this PR we did the following: