ROCm / flash-attention

Fast and memory-efficient exact attention
BSD 3-Clause "New" or "Revised" License

Ck tile/kvcache #74

Closed rocking5566 closed 3 months ago

rocking5566 commented 3 months ago

Integrate CK's append-KV and split-KV kernels from this PR: https://github.com/ROCm/composable_kernel/pull/1387
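
For context, a minimal sketch of the decode path these kernels back, assuming the ROCm fork keeps the upstream `flash_attn_with_kvcache` Python signature (the shapes, dtypes, and sequence lengths below are illustrative only): the new tokens' K/V are appended into a preallocated cache in place, and attention over the long cache is computed with the split-KV decode kernel.

```python
import torch
from flash_attn import flash_attn_with_kvcache  # assumes the ROCm build exports the upstream API

batch, heads, head_dim = 2, 8, 128
max_seqlen = 4096

# One new query token per sequence (typical decode step).
q = torch.randn(batch, 1, heads, head_dim, dtype=torch.float16, device="cuda")

# Preallocated KV cache, plus the K/V of the new token to append.
k_cache = torch.zeros(batch, max_seqlen, heads, head_dim, dtype=torch.float16, device="cuda")
v_cache = torch.zeros_like(k_cache)
k_new = torch.randn(batch, 1, heads, head_dim, dtype=torch.float16, device="cuda")
v_new = torch.randn_like(k_new)

# Number of tokens already stored in the cache for each sequence (illustrative).
cache_seqlens = torch.full((batch,), 100, dtype=torch.int32, device="cuda")

# k_new/v_new are written into k_cache/v_cache at cache_seqlens ("append kv"),
# then attention over the cached keys/values is computed ("split kv" decode).
out = flash_attn_with_kvcache(
    q, k_cache, v_cache,
    k=k_new, v=v_new,
    cache_seqlens=cache_seqlens,
    causal=True,
)
```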

Cydia2018 commented 3 months ago

Thanks for your work. I would like to know when paged KV in "flash_attn_varlen_func" will be supported.

see https://github.com/ROCm/flash-attention/blob/ck_improve_v0.1.1/csrc/flash_attn_ck/mha_varlen_fwd.cpp#L186
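
For reference, "paged KV" here means the K/V cache is stored as fixed-size blocks in a shared pool, and each sequence addresses its blocks through a block table instead of owning one contiguous buffer. Below is a minimal, framework-free sketch of that layout; all names, shapes, and block sizes are hypothetical, and whether and how the ROCm/CK varlen path accepts such a block table is exactly the open question in this thread.

```python
import torch

heads, head_dim = 8, 128
block_size = 256                 # tokens per KV block (hypothetical)
num_blocks = 64                  # physical blocks in the shared pool

# Shared K/V pools: (num_blocks, block_size, heads, head_dim).
k_pool = torch.zeros(num_blocks, block_size, heads, head_dim, dtype=torch.float16)
v_pool = torch.zeros_like(k_pool)

# Per-sequence block tables: sequence 0 owns physical blocks 3 and 7,
# sequence 1 owns blocks 0, 5, and 9.
block_table = [[3, 7], [0, 5, 9]]

def kv_at(seq_idx: int, token_pos: int):
    """Look up the K/V vectors for one logical token position of one sequence."""
    block = block_table[seq_idx][token_pos // block_size]
    offset = token_pos % block_size
    return k_pool[block, offset], v_pool[block, offset]

# Logical position 300 of sequence 1 lands in physical block 5, offset 44.
k_vec, v_vec = kv_at(seq_idx=1, token_pos=300)
```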

rocking5566 commented 2 months ago

> Thanks for your work. I would like to know when paged KV in "flash_attn_varlen_func" will be supported.
>
> see https://github.com/ROCm/flash-attention/blob/ck_improve_v0.1.1/csrc/flash_attn_ck/mha_varlen_fwd.cpp#L186

This will be supported soon.