apache / tvm

Open deep learning compiler stack for cpu, gpu and specialized accelerators
https://tvm.apache.org/
Apache License 2.0
11.66k stars 3.45k forks

[KVCache] Support fork in sliding window sink part #17127

Closed: cyx-6 closed this 3 months ago

cyx-6 commented 3 months ago

This PR adds support for forking in the attention sink part of a sliding window.
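To make the idea concrete, here is a toy sketch in plain Python. It is not TVM's implementation and does not use any TVM API. The `ToySequence` class, its fields, and the `fork` method are all hypothetical names chosen for illustration. The sketch assumes that a sequence keeps a fixed "sink" prefix that is never evicted plus a window of recent tokens, and that a fork is only allowed at a position inside the sink, since tokens outside the sink may already have been evicted by the sliding window.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToySequence:
    """Hypothetical per-sequence state: a pinned sink prefix plus recent tokens."""

    sink_size: int    # number of leading tokens that are never evicted
    window_size: int  # total tokens kept at once (sink + recent)
    sink: List[int] = field(default_factory=list)
    recent: List[int] = field(default_factory=list)

    def append(self, token: int) -> None:
        # Fill the sink first; afterwards tokens go into the sliding part.
        if len(self.sink) < self.sink_size:
            self.sink.append(token)
            return
        self.recent.append(token)
        # Evict the oldest non-sink token once the window is full.
        if len(self.recent) > self.window_size - self.sink_size:
            self.recent.pop(0)

    def fork(self, fork_pos: int) -> "ToySequence":
        # Assumption of this sketch: only the sink prefix is stable enough
        # to share, so the fork position must lie inside the sink.
        if not 0 <= fork_pos <= len(self.sink):
            raise ValueError("fork position must be inside the sink part")
        return ToySequence(
            sink_size=self.sink_size,
            window_size=self.window_size,
            sink=self.sink[:fork_pos],
        )
```

In this toy model the child copies the shared prefix and then grows independently. A real paged KV cache would more likely share the underlying pages between parent and child instead of copying them.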

cc: @tqchen @MasterJH5574 @yongwww