jrzaurin/pytorch-widedeep
A flexible package for multimodal-deep-learning to combine tabular data with text and images using Wide and Deep models in Pytorch
Apache License 2.0 · 1.26k stars · 186 forks
Flash Attention #180
Closed
kd1510 closed this 10 months ago
kd1510
commented
10 months ago
Adds an option to use flash attention in MultiHeadAttention (raises an AttributeError if PyTorch is older than 2.0).
Uses Torch's flash attention via scaled_dot_product_attention; the per-head outputs are concatenated using einops.
Includes benchmark tests checking that the flash implementation is on average at least 10% faster.
kd1510
commented
10 months ago
Closing to recreate this on the main fork.