ROCm / flash-attention

Fast and memory-efficient exact attention
BSD 3-Clause "New" or "Revised" License
142 stars, 46 forks

Added Dropout BWD #95

Open. alexkranias-amd opened 3 weeks ago

alexkranias-amd commented 2 weeks ago

dropout_bwd is not supported currently for