lucidrains / x-transformers

A concise but complete full-attention transformer with a set of promising experimental features from various papers

Feature request for Disentangled Attention from DeBERTa #17

Open · hadaev8 opened 3 years ago

hadaev8 commented 3 years ago

Paper, in case you missed it: https://arxiv.org/abs/2006.03654

lucidrains commented 3 years ago

@hadaev8 I think the novelty in DeBERTa is mainly in their positional encoding, and I'm deliberating whether to create a separate repository where I aggregate all the promising positional encoding solutions (and then make them pluggable into x-transformers)
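
For anyone landing here, a minimal sketch of what that positional encoding looks like: DeBERTa scores attention as a sum of content-to-content, content-to-position, and position-to-content terms over clipped relative distances, scaled by sqrt(3 * dim_head). The module name and hyperparameters below are hypothetical, not part of x-transformers:

```python
import torch
from torch import nn

class DisentangledAttention(nn.Module):
    # hypothetical minimal module, not the x-transformers API
    def __init__(self, dim, heads = 8, max_rel_dist = 128):
        super().__init__()
        self.heads = heads
        self.dim_head = dim // heads
        self.max_rel_dist = max_rel_dist

        self.to_qkv = nn.Linear(dim, dim * 3, bias = False)
        # shared relative position embeddings, with separate
        # projections for the positional queries and keys
        self.rel_pos_emb = nn.Embedding(2 * max_rel_dist, dim)
        self.to_rel_q = nn.Linear(dim, dim, bias = False)
        self.to_rel_k = nn.Linear(dim, dim, bias = False)
        self.to_out = nn.Linear(dim, dim)

    def forward(self, x):
        b, n, d, h = *x.shape, self.heads
        q, k, v = self.to_qkv(x).chunk(3, dim = -1)
        q, k, v = map(lambda t: t.view(b, n, h, -1).transpose(1, 2), (q, k, v))

        # clipped relative distances delta(i, j) = i - j, shifted into [0, 2k)
        pos = torch.arange(n, device = x.device)
        rel = (pos[:, None] - pos[None, :]).clamp(-self.max_rel_dist, self.max_rel_dist - 1) + self.max_rel_dist

        rel_emb = self.rel_pos_emb.weight                                          # (2k, dim)
        rel_q = self.to_rel_q(rel_emb).view(-1, h, self.dim_head).transpose(0, 1)  # (h, 2k, dim_head)
        rel_k = self.to_rel_k(rel_emb).view(-1, h, self.dim_head).transpose(0, 1)

        # content-to-content
        c2c = torch.einsum('b h i d, b h j d -> b h i j', q, k)
        # content-to-position: q_i against the relative key at delta(i, j)
        c2p = torch.einsum('b h i d, h r d -> b h i r', q, rel_k)
        c2p = c2p.gather(-1, rel.expand(b, h, n, n))
        # position-to-content: k_j against the relative query at delta(j, i)
        p2c = torch.einsum('b h j d, h r d -> b h j r', k, rel_q)
        p2c = p2c.gather(-1, rel.expand(b, h, n, n)).transpose(-1, -2)

        # paper scales by sqrt(3 * dim_head), one factor per score term
        attn = ((c2c + c2p + p2c) / (3 * self.dim_head) ** 0.5).softmax(dim = -1)
        out = torch.einsum('b h i j, b h j d -> b h i d', attn, v)
        return self.to_out(out.transpose(1, 2).reshape(b, n, d))
```

The forward pass takes `x` of shape `(batch, seq, dim)`. Note the relative embedding table is shared across heads and projected per head, which keeps the parameter count close to a standard attention layer.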

hadaev8 commented 3 years ago

@lucidrains Please let me know when it happens!

danieltudosiu commented 2 years ago

@lucidrains I second that, especially if we start discussing 2D- and 3D-specific ones.