kklemon opened 1 year ago
Inspired by this work, I implemented the Perceiver architecture with out-of-the-box FlashAttention support. It offers a substantial speedup over a naive implementation and supports up to 16x longer input sequences on the same hardware.
You can find the project under fast-perceiver.
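For illustration, here is a minimal sketch of the core idea (not the actual fast-perceiver API): a Perceiver-style cross-attention block in which a small set of learned latents attends to a long input sequence via PyTorch's `torch.nn.functional.scaled_dot_product_attention`, which can dispatch to a FlashAttention kernel on supported hardware. All names and shapes below are illustrative assumptions.

```python
# Sketch of the idea only (not the fast-perceiver API): a Perceiver-style
# cross-attention block where a small, fixed set of learned latent vectors
# attends to a long input sequence. PyTorch's scaled_dot_product_attention
# can dispatch to a FlashAttention kernel on supported GPUs, so attention
# cost is bounded by the number of latents rather than the input length.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PerceiverCrossAttention(nn.Module):
    def __init__(self, dim: int, num_latents: int, num_heads: int = 8):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        # Learned latent array; its small, fixed size bounds the attention cost.
        self.latents = nn.Parameter(torch.randn(num_latents, dim) * 0.02)
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_kv = nn.Linear(dim, 2 * dim, bias=False)
        self.proj = nn.Linear(dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim) -- the long input sequence.
        b = x.shape[0]
        q = self.to_q(self.latents.expand(b, -1, -1))   # (b, num_latents, dim)
        k, v = self.to_kv(x).chunk(2, dim=-1)           # (b, seq_len, dim) each

        def split_heads(t: torch.Tensor) -> torch.Tensor:
            return t.reshape(b, -1, self.num_heads, self.head_dim).transpose(1, 2)

        q, k, v = map(split_heads, (q, k, v))
        # Dispatches to a FlashAttention backend when supported
        # (e.g. fp16/bf16 inputs on a recent CUDA GPU).
        out = F.scaled_dot_product_attention(q, k, v)
        out = out.transpose(1, 2).reshape(b, -1, self.num_heads * self.head_dim)
        return self.proj(out)                            # (b, num_latents, dim)


if __name__ == "__main__":
    block = PerceiverCrossAttention(dim=256, num_latents=64)
    x = torch.randn(2, 10_000, 256)     # long input sequence
    print(block(x).shape)               # torch.Size([2, 64, 256])
```

Because the latent array stays small, memory grows only linearly with the input length, which is what allows the much longer sequences mentioned above.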
@kklemon very nice! 🚀