Self-attn in decoder layers.

jozhang97 / DETA

Detection Transformers with Assignment

Apache License 2.0

243 stars 20 forks source link

Self-attn in decoder layers. #16

Open TsingWei opened 1 year ago

TsingWei commented 1 year ago

I noticed there is a section about DETA does not need self-attention in the decoder. in the paper. The results show that when the self-attn is replaced by ffn in decoder, the performance is better. I wonder whether the final version in the table of compared-with-other-SOTAs using this setting? Because I found in the code that the self-attn is hard-coded in the decoder layer: https://github.com/jozhang97/DETA/blob/dade1763efba58a1f3077d373e991fd319dc240e/models/deformable_transformer.py#L328

jozhang97 commented 1 year ago

Thank you for your interest!

I wonder whether the final version in the table of compared-with-other-SOTAs using this setting? No, our default model still contains self-attention.