lucidrains / rotary-embedding-torch

Implementation of Rotary Embeddings, from the RoFormer paper, in PyTorch
MIT License

Usage with x-transformers #10

Open · sonovice opened this issue 1 year ago

sonovice commented 1 year ago

Is it possible to easily use axial rotary embeddings with your x-transformers without having to dissect the Attention module? At first glance, it seems there is no simple way to just pass an instance of RotaryEmbedding to an x-transformers encoder.

Any help would be appreciated.
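For the plain 1d case, x-transformers already exposes rotary embeddings behind a constructor flag; a minimal sketch (assuming the current TransformerWrapper / Encoder API; the axial variant does not appear to have an equivalent switch):

```python
import torch
from x_transformers import TransformerWrapper, Encoder

# standard 1d rotary embeddings are built into x-transformers behind a flag
model = TransformerWrapper(
    num_tokens = 20000,
    max_seq_len = 1024,
    attn_layers = Encoder(
        dim = 512,
        depth = 6,
        heads = 8,
        rotary_pos_emb = True   # 1d rotary only, not the axial variant
    )
)

x = torch.randint(0, 20000, (1, 1024))
logits = model(x)  # (1, 1024, 20000)
```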

lucidrains commented 1 year ago

@sonovice hey Simon :wave:

you are seeing success with axial rotary embeddings, i'm guessing on mel spec?

that's a bit of a personal invention that i haven't broadcasted that much

i can think about integrating it if you share what your experimental results look like
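For reference, the axial variant rotates queries and keys with per-axis frequencies concatenated along the feature dimension; a minimal 2d sketch, assuming the repo's get_axial_freqs and apply_rotary_emb helpers (grid size and head dim are arbitrary):

```python
import torch
from rotary_embedding_torch import RotaryEmbedding, apply_rotary_emb

pos_emb = RotaryEmbedding(
    dim = 16,             # per-axis rotation dim
    freqs_for = 'pixel',  # continuous frequencies for spatial coordinates
    max_freq = 256
)

# queries / keys laid out on a 2d grid: (batch, height, width, head_dim)
q = torch.randn(1, 64, 64, 64)
k = torch.randn(1, 64, 64, 64)

# per-position frequencies for both axes, concatenated on the last dim
freqs = pos_emb.get_axial_freqs(64, 64)

# rotate the leading feature dims of q and k in place of absolute positions
q = apply_rotary_emb(freqs, q)
k = apply_rotary_emb(freqs, k)
```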

sonovice commented 1 year ago

@lucidrains Hey Phil and thanks for the fast response.

Actually, I didn't have any kind of spectral features in mind (though you just triggered an entire world of new ideas :wink:)

What I would like to try is to recreate something like LayoutLM for musical scores with meaningful 2d relative positional embeddings to capture the relations between musical glyphs in a score page. Your axial rotary embeddings seem like a perfect fit.

EDIT: LayoutLM in a nutshell: take detected (and classified) objects from a text document image, add learned embeddings for x, y, w, and h, and use those embeddings for tasks like paragraph classification.
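In code, that recipe is roughly the following (a hypothetical sketch; the BoxEmbedding name, bucket count, and coordinate quantization are made up for illustration):

```python
import torch
import torch.nn as nn

class BoxEmbedding(nn.Module):
    # LayoutLM-style learned 2d position embeddings: one lookup table
    # per quantized box coordinate, summed into the token embedding
    def __init__(self, dim, num_buckets = 1024):
        super().__init__()
        self.x_emb = nn.Embedding(num_buckets, dim)
        self.y_emb = nn.Embedding(num_buckets, dim)
        self.w_emb = nn.Embedding(num_buckets, dim)
        self.h_emb = nn.Embedding(num_buckets, dim)

    def forward(self, boxes):
        # boxes: (batch, num_objects, 4) integer-bucketed x, y, w, h
        x, y, w, h = boxes.unbind(dim = -1)
        return self.x_emb(x) + self.y_emb(y) + self.w_emb(w) + self.h_emb(h)

# usage: add to the glyph/class embeddings before the encoder
emb = BoxEmbedding(dim = 512)
boxes = torch.randint(0, 1024, (2, 100, 4))
pos = emb(boxes)  # (2, 100, 512)
```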

sonovice commented 1 year ago

@lucidrains I finally found some time to look at this again. Would you be open to a pull request against x-transformers if I manage to introduce this?

alvitawa commented 9 months ago

@sonovice I'm looking into doing something similar (in a different domain). Did you have any success with 4d rotary embeddings?
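For anyone else attempting this, one plausible route to 4d (a sketch only, not an API of this repo: the per-coordinate split of the head dimension is my own assumption, and it presumes RotaryEmbedding's forward accepts raw coordinate tensors) is to give each box coordinate its own slice of the head dimension and rotate that slice with frequencies derived from the coordinate:

```python
import torch
from rotary_embedding_torch import RotaryEmbedding, apply_rotary_emb

head_dim = 64
# one small rotary module reused for each of the four coordinates
rotary = RotaryEmbedding(dim = head_dim // 4, freqs_for = 'pixel', max_freq = 256)

def rotate_4d(q, coords):
    # q: (batch, heads, num_objects, head_dim)
    # coords: (batch, num_objects, 4) with x, y, w, h normalized to [-1, 1]
    out = []
    for i, chunk in enumerate(q.chunk(4, dim = -1)):
        freqs = rotary(coords[..., i])   # (batch, num_objects, head_dim // 4)
        freqs = freqs.unsqueeze(1)       # broadcast across heads
        out.append(apply_rotary_emb(freqs, chunk))
    return torch.cat(out, dim = -1)

q = torch.randn(2, 8, 100, head_dim)
coords = torch.rand(2, 100, 4) * 2 - 1
q = rotate_4d(q, coords)  # apply the same to keys
```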