autoliuweijie / K-BERT

Source code of K-BERT (AAAI2020)
https://ojs.aaai.org//index.php/AAAI/article/view/5681
Attention score problem #54

Open zhaiyutong opened 3 years ago

zhaiyutong commented 3 years ago

Hi, first of all, thank you for your awesome work and code!

While porting your PyTorch code to TensorFlow, I ran into a question.

In your code, you build the attention mask from the visible matrix in bert_encoder.py, and then in multi_headed_attn.py you have the following code on line 59:

scores = scores + mask

I am wondering whether that corresponds to the attention score function (5) in your paper. Is the mask the additional M?

Thank you in advance for your response.

autoliuweijie commented 3 years ago

Yes, the mask is represented as the matrix M in our paper.
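
For anyone porting this, here is a minimal, framework-neutral NumPy sketch of the idea discussed above: a 0/1 visible matrix is turned into an additive mask M, which is added to the scaled scores before the softmax. This is not the repository's code. The function names, the toy visible matrix, and the constant -10000.0 (a finite stand-in for negative infinity) are assumptions made for illustration.

```python
import numpy as np

# Assumed finite stand-in for -inf; large enough that exp() underflows to 0.
NEG_LARGE = -10000.0


def visible_to_additive_mask(visible):
    """Map a 0/1 visible matrix to an additive mask M.

    M is 0 where token j is visible to token i, NEG_LARGE where it is hidden.
    """
    visible = np.asarray(visible, dtype=np.float64)
    return (1.0 - visible) * NEG_LARGE


def masked_attention_probs(q, k, visible):
    """Single-head attention weights: softmax(q k^T / sqrt(d_k) + M)."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)
    scores = scores + visible_to_additive_mask(visible)  # the "scores + mask" step
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)


# Toy example (hypothetical data): 3 tokens, head size 4.
rng = np.random.default_rng(0)
q = rng.normal(size=(3, 4))
k = rng.normal(size=(3, 4))
visible = np.array([[1, 1, 0],
                    [1, 1, 1],
                    [0, 1, 1]])

probs = masked_attention_probs(q, k, visible)
print(np.round(probs, 3))
```

Positions where the visible matrix is 0 end up with (numerically) zero attention weight, and each row still sums to 1. In a TensorFlow port the same two steps should carry over: build the additive mask once, then add it to the scores before the softmax.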