nouhadziri / THRED

The implementation of the paper "Augmenting Neural Response Generation with Context-Aware Topical Attention"
https://arxiv.org/abs/1811.01063
MIT License

Confused about the topic attention in the original paper #11

Closed Helicqin closed 5 years ago

Helicqin commented 5 years ago

Sorry to bother you with this. I have read your great paper, but I have some confusion about the topic attention.

In the paper, you said:

The topic words {t1, t2, ..., tn} are then linearly combined to form a fixed-length vector k. The weight values are calculated as the following: [equation screenshot not included]

I can hardly figure it out. Is it the same as standard query-key-value attention? My understanding is that the final context-level encoder hidden state serves as the query and the word embeddings of the topic words serve as the values. But how are the weights \beta calculated?

Looking forward to your reply! Thanks.

ehsk commented 5 years ago

Thanks for your interest in our work. Your guess is correct. It is regular attention, and the equation for learning the weights is as follows:

[equation image]

We have made a few changes (including this one) in the paper, which will be published shortly.
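In code form, regular attention over the topic words with the roles described above (final context-level encoder hidden state as the query, topic word embeddings as the values) could look like the following minimal NumPy sketch. The dot-product scoring function and all names here are illustrative assumptions, not the exact equation from the paper:

```python
import numpy as np

def topic_attention(context_state, topic_embeddings):
    """Combine topic word embeddings into a single fixed-length vector k.

    context_state:    (d,)   final context-level encoder hidden state (query)
    topic_embeddings: (n, d) embeddings of the topic words t_1..t_n (keys/values)
    Returns (k, beta), where k has shape (d,) and beta holds the attention weights.
    """
    # Score each topic word against the query; the paper may use a learned
    # (e.g. bilinear or MLP) scoring function instead of a plain dot product.
    scores = topic_embeddings @ context_state            # (n,)
    # Softmax turns the scores into the weights beta.
    scores = scores - scores.max()                        # numerical stability
    beta = np.exp(scores) / np.exp(scores).sum()          # (n,)
    # Linear combination of the topic embeddings gives the vector k.
    k = beta @ topic_embeddings                           # (d,)
    return k, beta

# Tiny usage example with random data.
rng = np.random.default_rng(0)
h = rng.standard_normal(8)           # context-level hidden state
T = rng.standard_normal((5, 8))      # embeddings of 5 topic words
k, beta = topic_attention(h, T)
print(beta.sum())                    # ~1.0
print(k.shape)                       # (8,)
```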

Helicqin commented 5 years ago

Thanks for your reply!