travel-go / Abstractive-Text-Summarization

Contrastive Attention Mechanism for Abstractive Text Summarization

❓ Question: Is contrastive attention applied to a single head or to all heads? #2

Open astariul opened 4 years ago

astariul commented 4 years ago

I'm confused about one detail in your paper.

From Figure 1, it seems the contrastive attention mechanism is applied to each decoder layer:

[image: Figure 1 from the paper]

But the text mentions that each attention head has a different responsibility and that you hand-picked the best one:

[image: excerpt from the paper]


So I'm not sure whether the contrastive mechanism is applied to every head of the model or only to the one you chose.
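To make the question concrete, here is a rough NumPy sketch of the two readings (my own illustration, not code from the paper or this repo; the `opponent_head` parameter and the use of a negated-score softmax for the opponent attention are assumptions on my part):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def contrastive_attention(scores, opponent_head=None):
    """Illustrative sketch of the two interpretations.

    scores: raw attention scores, shape (num_heads, tgt_len, src_len).
    If opponent_head is None, an opponent (contrastive) softmax over the
    negated scores is applied to every head; otherwise only the chosen
    head is replaced and the remaining heads keep conventional attention.
    """
    attn = softmax(scores)  # conventional attention for all heads
    if opponent_head is None:
        # Reading 1: contrastive attention on every head.
        opponent = softmax(-scores)
    else:
        # Reading 2: contrastive attention only on one hand-picked head.
        opponent = attn.copy()
        opponent[opponent_head] = softmax(-scores[opponent_head])
    return attn, opponent
```

Either way, each row of both outputs still sums to 1 over the source positions; the two readings differ only in which heads get the opponent distribution.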