`TransformerBlock` now uses Flash Attention when `use_rel_pos` isn't needed.
Note: I only tested this briefly with an untrained model to check for obvious crashes, and will continue with actual testing once I finish building a new training loop for my models. It runs, but the results might not be correct.
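For context, a minimal sketch of what the dispatch might look like, assuming a PyTorch attention module with an optional `use_rel_pos` flag (the class and parameter names here are illustrative, not the actual implementation in this PR):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Attention(nn.Module):
    """Multi-head attention that takes the fused (Flash) path via
    scaled_dot_product_attention when relative positional embeddings
    are not used. Illustrative sketch only."""

    def __init__(self, dim: int, num_heads: int = 8, use_rel_pos: bool = False):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.use_rel_pos = use_rel_pos
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, N, _ = x.shape
        # (B, N, 3, heads, head_dim) -> (3, B, heads, N, head_dim)
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4).unbind(0)

        if not self.use_rel_pos:
            # Fast path: PyTorch selects a fused (Flash) kernel when available.
            out = F.scaled_dot_product_attention(q, k, v)
        else:
            # Slow path: explicit attention so relative position biases can be
            # added to `attn` before the softmax (bias computation omitted here).
            attn = (q * self.scale) @ k.transpose(-2, -1)
            attn = attn.softmax(dim=-1)
            out = attn @ v

        out = out.transpose(1, 2).reshape(B, N, -1)
        return self.proj(out)
```

The idea is simply that the fused kernel cannot accept the per-head relative position bias, so the explicit-softmax path is kept only for the `use_rel_pos=True` case.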