The-AI-Summer / self-attention-cv

Implementation of various self-attention mechanisms focused on computer vision. Ongoing repository.
https://theaisummer.com/
MIT License
1.18k stars · 154 forks

Do the encoder modules incorporate positional encoding? #1

Closed jfkback closed 3 years ago

jfkback commented 3 years ago

I am wondering whether, if I use, say, the LinformerEncoder, I have to add the positional encoding myself, or whether that's already done. From the source files it doesn't seem to be there, but I'm not sure how to include the positional encoding: the encoding modules seem to need the query, which isn't available when just passing data directly to the LinformerEncoder. I may well be missing something; any help would be great. Perhaps an example using positional encoding would be useful.

black0017 commented 3 years ago

Hello, I haven't implemented the sinusoidal positional encodings. Is that what you mean by positional encoding?

I didn't find any info on positional encodings in the Linformer paper, so I guess they use the sinusoidal positional encoding.

Would it be of any use to implement them for you?

I don't think it would be a big deal.
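For concreteness, the sinusoidal encoding from the vanilla transformer ("Attention Is All You Need") can be sketched in PyTorch as follows. This is a minimal sketch of the standard formula, not the repo's implementation, and the function name is mine:

```python
import math

import torch


def sinusoidal_positional_encoding(seq_len: int, dim: int) -> torch.Tensor:
    """Fixed sinusoidal positional encoding (Vaswani et al., 2017).

    Returns a [seq_len, dim] tensor: even dims hold sin terms, odd dims
    hold cos terms, each at a geometrically spaced frequency.
    """
    position = torch.arange(seq_len, dtype=torch.float).unsqueeze(1)  # [seq_len, 1]
    div_term = torch.exp(
        torch.arange(0, dim, 2, dtype=torch.float) * (-math.log(10000.0) / dim)
    )  # [dim // 2] frequencies: 1/10000^(2i/dim)
    pe = torch.zeros(seq_len, dim)
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe
```

Because the table depends only on position and dimension, it is computed once and added to the token embeddings; no parameters are learned.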

jfkback commented 3 years ago

Specifically, I was wondering whether the positional encoding model provided in the pos_embeddings directory is meant to add positional encoding. I may have misunderstood the meaning of pos_embeddings, though.

Either way, if you could implement sinusoidal positional encodings, that would be great. I'm trying to use the Linformer as part of a larger model where position will be very important. Thanks!

black0017 commented 3 years ago

OK, great. Let me know how it works.

I added the sinusoidal positional encoding from the vanilla transformer: https://github.com/The-AI-Summer/self-attention-cv/blob/main/examples/pos_emb_1d.py#L19
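Once such a table is available, using it amounts to adding it to the token embeddings before the encoder's forward pass. A minimal sketch of that wiring, where the shapes are hypothetical and the encoder call is left abstract since its exact constructor arguments depend on the repo's API:

```python
import math

import torch

batch, seq_len, dim = 2, 10, 64            # hypothetical sizes
x = torch.randn(batch, seq_len, dim)       # stand-in token embeddings

# Build the fixed sinusoidal table [seq_len, dim] once.
position = torch.arange(seq_len, dtype=torch.float).unsqueeze(1)
div_term = torch.exp(
    torch.arange(0, dim, 2, dtype=torch.float) * (-math.log(10000.0) / dim)
)
pe = torch.zeros(seq_len, dim)
pe[:, 0::2] = torch.sin(position * div_term)
pe[:, 1::2] = torch.cos(position * div_term)

# Add positions to the input, broadcasting the table over the batch.
x = x + pe.unsqueeze(0)
# out = encoder(x)                         # e.g. a LinformerEncoder forward pass
```

Since the encoding is added to the input rather than computed inside attention, no query is needed, which sidesteps the problem raised in the original question.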

Thanks for the issue; I think it's a useful addition.

To understand the difference between pos_embeddings and pos_encodings, check my latest article: https://theaisummer.com/positional-embeddings/