66RING / tiny-flash-attention

A FlashAttention tutorial written in Python, Triton, CUDA, and CUTLASS.
194 stars, 14 forks