daoyuan98 / Relation-CZSL

Official implementation of the TMM paper "Relation-aware Compositional Zero-shot Learning for Attribute-Object Pair Recognition".
MIT License

Calculating attention mechanism #2

Open ans92 opened 2 years ago

ans92 commented 2 years ago

Hi, thank you for this great work and code. Could you please point me to the lines in the code where the attention in Equation 1 of the paper is calculated? In particular, I would like to know what W, K, and q in Equation 1 correspond to in the code. If you could point me to those places, that would be great. Thank you.

daoyuan98 commented 2 years ago

Hi, thank you for your interest! Please refer to the following lines for the attention computation: https://github.com/daoyuan98/Relation-CZSL/blob/ea76ed8dbb8c2d24e93b37540ec0c7ca4e408736/model/SepMask.py#L240 https://github.com/daoyuan98/Relation-CZSL/blob/ea76ed8dbb8c2d24e93b37540ec0c7ca4e408736/model/SepMask.py#L241
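For readers without the repository checked out, a minimal sketch of a query-key attention of the kind Equation 1 describes could look like the snippet below. The names `q`, `K`, and `W` and their shapes are illustrative assumptions mirroring the symbols in the question, not the actual variable names used in SepMask.py.

```python
import torch
import torch.nn.functional as F

def attention_weights(q, K, W):
    # Project the query with W, score it against every key,
    # then normalize the scores with a softmax (assumed form of Eq. 1).
    scores = K @ (W @ q)                                 # (num_keys,)
    return F.softmax(scores / K.size(-1) ** 0.5, dim=-1)

q = torch.randn(64)       # query vector q
K = torch.randn(10, 64)   # key matrix K (10 keys)
W = torch.randn(64, 64)   # learned projection W
alpha = attention_weights(q, K, W)   # attention weights summing to 1
```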

ans92 commented 2 years ago

Hi @daoyuan98, thank you for your response. Yes, I have seen that. I noticed that you instantiate your key, query, and value embeddings from a normal distribution. Do these embeddings then learn semantic information about the concept, e.g. "red apple"? Normally, what I see is that people use word2vec or a similar pretrained language model to obtain the concept embedding and then train it, whereas here the semantic information is learned starting from normally distributed values. Please correct me if I have misunderstood. Thank you.
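For comparison, here is a small sketch of the two initialization strategies the comment contrasts: learnable concept embeddings drawn from a normal distribution versus embeddings initialized from pretrained word vectors. The tensor `pretrained_vectors` is a hypothetical placeholder for vectors loaded from a word2vec/GloVe file; the repository's actual code may differ.

```python
import torch
import torch.nn as nn

num_concepts, dim = 100, 300

# Option 1: randomly initialized, learnable embeddings (normal distribution).
random_emb = nn.Embedding(num_concepts, dim)
nn.init.normal_(random_emb.weight, mean=0.0, std=0.02)

# Option 2: embeddings initialized from pretrained word vectors and fine-tuned.
# `pretrained_vectors` stands in for real word2vec/GloVe vectors of shape (num_concepts, dim).
pretrained_vectors = torch.randn(num_concepts, dim)  # placeholder, not real vectors
pretrained_emb = nn.Embedding.from_pretrained(pretrained_vectors, freeze=False)
```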