lucidrains / x-transformers

A concise but complete full-attention transformer with a set of promising experimental features from various papers
MIT License

Feature request: add local and reformer #188

Open samvanstroud opened 1 year ago

samvanstroud commented 1 year ago

Thanks for this repo. Is there a possibility of adding your existing local attention and reformer implementations here?

I'm hoping they could also be updated to take advantage of the upcoming attention mask support for the memory-efficient attention kernel in PyTorch 2.1.

lucidrains commented 1 year ago

Yeah, I do have plans to make it possible to register custom transformer blocks. It will probably be tested with mixture of experts first (https://github.com/lucidrains/st-moe-pytorch), but I'll probably also consider local attention.
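
For context on the request: local attention can be expressed as a banded boolean mask in which each query position may attend only to keys within a fixed window. The following is a minimal, dependency-free sketch of that idea, not the x-transformers or local-attention implementation. The function name `local_attention_mask` and its `window` and `causal` parameters are hypothetical and chosen only for illustration.

```python
def local_attention_mask(seq_len, window, causal=False):
    """Build a seq_len x seq_len boolean mask for sliding-window attention.

    mask[i][j] is True when query position i may attend to key position j.
    `window` is the maximum allowed distance |i - j| (hypothetical parameter).
    With causal=True, future keys (j > i) are also blocked.
    """
    if seq_len < 0 or window < 0:
        raise ValueError("seq_len and window must be non-negative")

    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            allowed = abs(i - j) <= window
            if causal:
                allowed = allowed and j <= i
            row.append(allowed)
        mask.append(row)
    return mask


if __name__ == "__main__":
    # Print the band structure: "x" marks an allowed query/key pair.
    for row in local_attention_mask(5, 1):
        print("".join("x" if allowed else "." for allowed in row))
```

In PyTorch, a mask like this could be converted with `torch.tensor(mask)` and passed as the `attn_mask` argument of `torch.nn.functional.scaled_dot_product_attention`, where a boolean `True` marks a position that takes part in attention. Whether the memory-efficient kernel is actually selected for a masked call depends on the PyTorch version and backend, so that part remains an assumption here.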