BinWang28 / SBERT-WK-Sentence-Embedding

IEEE/ACM TASLP 2020: SBERT-WK: A Sentence Embedding Method By Dissecting BERT-based Word Models
Apache License 2.0

Biasing Attention #12

Closed · g-luo closed this issue 3 years ago

g-luo commented 3 years ago

Hello! I had a few questions:

  1. Is it possible to bias SBERT-WK's attention toward certain words? For example, given a sentence like "a brown dog and a cat", I would like the embedding to be biased toward sentences about dogs.
  2. Which model would be best for a news dataset? I found that binwang/bert-base-uncased worked fairly well, but I wanted to ask based on your expertise.

Thanks so much for your help!

BinWang28 commented 3 years ago

Hi Grace,

  1. To put special attention on certain words (e.g., words from a dictionary), you can change the second part of the method (word importance). Instead of determining the weights by variance, you can manually assign larger weights to the words you care about; a rough sketch of the idea follows this list.
  2. None of the models is specially designed for news data, so it is probably best to try a few and compare.
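
Here is a minimal sketch of that idea (not code from this repo). It assumes you already have the variance-based weights for the tokens; the function name, the blending factor, and the example values are all illustrative.

```python
import numpy as np

def blend_weights(tokens, variance_weights, focus_words, alpha=0.5):
    """Blend variance-based weights with a manual preference.

    Tokens in `focus_words` share (1 - alpha) of the total weight;
    the remaining alpha keeps the original variance-based profile.
    All names and values here are illustrative.
    """
    w_var = np.asarray(variance_weights, dtype=float)
    w_var = w_var / w_var.sum()

    # Indicator vector: 1 for tokens to emphasize, 0 otherwise.
    # lstrip("#") lets WordPiece subwords ("##dog") match as well.
    w_man = np.array([1.0 if t.lower().lstrip("#") in focus_words else 0.0
                      for t in tokens])
    w_man = w_man / w_man.sum() if w_man.sum() > 0 else w_var

    return alpha * w_var + (1.0 - alpha) * w_man

# Toy example for "a brown dog and a cat": "dog" gets a larger share.
tokens = ["a", "brown", "dog", "and", "a", "cat"]
variance_weights = [0.10, 0.20, 0.20, 0.10, 0.10, 0.30]
print(blend_weights(tokens, variance_weights, {"dog"}))
```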

Hope it helps.

g-luo commented 3 years ago

Thank you for your quick reply!

When you say to pick the weights manually instead of by variance, do you mean modifying what's being done here? I'm not quite sure where in the repo the word weights are being determined.

BinWang28 commented 3 years ago

Hi Grace,

You can edit this part: https://github.com/BinWang28/SBERT-WK-Sentence-Embedding/blob/889e6691cdb8f962159ad3ae02f7917c342cb34e/utils.py#L150

var_token contains the weight for each token in the sentence.
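
For illustration only, a change along the following lines could go right after that point; the token list and values below are stand-ins for whatever the surrounding code in utils.py actually provides, so treat this purely as a sketch.

```python
import numpy as np

# Stand-ins: in utils.py these would be the real subword tokens and the
# var_token array computed at the linked line.
tokens = ["a", "brown", "dog", "and", "a", "cat"]
var_token = np.array([0.10, 0.20, 0.20, 0.10, 0.10, 0.30])

# Manually boost the tokens of interest, then renormalize so the
# weights still sum to one.
focus_words = {"dog", "dogs"}
boost = 3.0  # emphasis factor, tune as needed
for i, tok in enumerate(tokens):
    if tok.lower().lstrip("#") in focus_words:
        var_token[i] *= boost
var_token /= var_token.sum()

print(dict(zip(tokens, var_token.round(3))))
```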

g-luo commented 3 years ago

Gotcha; thanks so much!