Hello, I read your paper recently, and thank you for sharing your code. I tried to find the architecture described in the paper, but I couldn't locate the attention module. The format of your code is different from other common PyTorch code. In my understanding, you implement attention with a 1x1 depth-wise convolution and then merge the knowledge. This is different from attention in NLP, for example multi-head attention, which has key/query/value vectors. Could you please help me? Thank you very much.

Yes, you are right. The attention module designed here is different from the attention modules used in NLP tasks. We don't necessarily need to follow the key/query attention design; as long as the module follows the idea of "attended features", it can be called an attention-based method.
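To make the contrast concrete, here is a minimal sketch (not the actual module from this repository) of what conv-style "attended features" can look like: a 1x1 depth-wise convolution predicts a gate that re-weights the input feature map, with no key/query/value projections as in NLP multi-head attention. The class name `ConvAttention` and the sigmoid gating are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ConvAttention(nn.Module):
    """Illustrative 'attended features' module (not the repo's code):
    a 1x1 depth-wise conv predicts per-channel, per-location weights
    that re-weight the input features."""
    def __init__(self, channels: int):
        super().__init__()
        # 1x1 depth-wise conv produces the attention scores (assumed design)
        self.score = nn.Conv2d(channels, channels, kernel_size=1, groups=channels)
        self.gate = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Attention map has the same shape as x; no key/query/value split
        attn = self.gate(self.score(x))
        return x * attn  # "attended" features

if __name__ == "__main__":
    feats = torch.randn(2, 64, 32, 32)   # dummy feature map
    out = ConvAttention(64)(feats)
    print(out.shape)                      # torch.Size([2, 64, 32, 32])
```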