66RING / tiny-flash-attention

Flash Attention tutorial written in Python, Triton, CUDA, and CUTLASS
216 stars, 17 forks