codeplaysoftware / cutlass-fork

CUDA Templates for Linear Algebra Subroutines

Add Flash Attention v2 example #156

Open muhammad-tanvir-1211 opened 1 week ago

muhammad-tanvir-1211 commented 1 week ago

Add a Flash Attention v2 example for the Intel Xe backend.
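For reviewers unfamiliar with the algorithm, here is a minimal scalar sketch of the online-softmax recurrence that Flash Attention v2 is built around. It is not the code in this PR and uses no CUTLASS or Xe APIs; the names `attention_reference`, `attention_online`, `dot_product`, and `Matrix` are hypothetical, and it handles a single query row on the CPU. It only illustrates that processing K/V in blocks with a running max and running sum gives the same result as a full softmax.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper types and names, for illustration only.
using Matrix = std::vector<std::vector<float>>;

static float dot_product(const std::vector<float>& a, const std::vector<float>& b) {
  float s = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
  return s;
}

// Baseline: softmax(scale * q K^T) V for one query row, materializing all scores.
std::vector<float> attention_reference(const std::vector<float>& q, const Matrix& K,
                                       const Matrix& V, float scale) {
  assert(!K.empty() && K.size() == V.size());
  const std::size_t n = K.size(), d = V[0].size();
  std::vector<float> scores(n);
  float row_max = -std::numeric_limits<float>::infinity();
  for (std::size_t i = 0; i < n; ++i) {
    scores[i] = dot_product(q, K[i]) * scale;
    row_max = std::max(row_max, scores[i]);
  }
  float row_sum = 0.0f;
  std::vector<float> out(d, 0.0f);
  for (std::size_t i = 0; i < n; ++i) {
    const float p = std::exp(scores[i] - row_max);
    row_sum += p;
    for (std::size_t j = 0; j < d; ++j) out[j] += p * V[i][j];
  }
  for (float& x : out) x /= row_sum;
  return out;
}

// Blocked variant: visits K/V in chunks of `block` rows and keeps only a
// running max, a running sum, and an unnormalized accumulator.
std::vector<float> attention_online(const std::vector<float>& q, const Matrix& K,
                                    const Matrix& V, float scale, std::size_t block) {
  assert(!K.empty() && K.size() == V.size() && block > 0);
  const std::size_t n = K.size(), d = V[0].size();
  float running_max = -std::numeric_limits<float>::infinity();
  float running_sum = 0.0f;
  std::vector<float> acc(d, 0.0f);
  for (std::size_t start = 0; start < n; start += block) {
    const std::size_t end = std::min(start + block, n);
    std::vector<float> scores(end - start);
    float new_max = running_max;
    for (std::size_t i = start; i < end; ++i) {
      scores[i - start] = dot_product(q, K[i]) * scale;
      new_max = std::max(new_max, scores[i - start]);
    }
    // Rescale earlier partial results to the new max (factor is 0 on the first block).
    const float correction = std::exp(running_max - new_max);
    running_sum *= correction;
    for (float& x : acc) x *= correction;
    for (std::size_t i = start; i < end; ++i) {
      const float p = std::exp(scores[i - start] - new_max);
      running_sum += p;
      for (std::size_t j = 0; j < d; ++j) acc[j] += p * V[i][j];
    }
    running_max = new_max;
  }
  // Normalize once at the end rather than after every block.
  for (float& x : acc) x /= running_sum;
  return acc;
}
```

Deferring the division by the running sum until after the last block mirrors the v2 idea of keeping the accumulator unnormalized inside the loop; a real kernel would apply the same recurrence per tile of queries rather than per row.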