NileZhou opened this issue 5 years ago
According to the paper "Get To The Point" (Eq. 10), the coverage vector is the sum of the attention distributions over the encoder (source) positions from all previous decoder timesteps.
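For reference, here is a minimal LaTeX rendering of the relevant definitions, assuming the paper's standard notation where $a^{t'}$ is the attention distribution over source positions at decoder step $t'$:

```latex
% Coverage vector (Eq. 10): sum of the attention distributions a^{t'}
% (each over encoder/source positions) from all previous decoder steps t' < t.
c^{t} = \sum_{t'=0}^{t-1} a^{t'}

% Coverage loss (Eq. 12): penalizes re-attending to already-covered source tokens.
\text{covloss}_t = \sum_i \min\!\left(a_i^{t},\, c_i^{t}\right)
```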
Thanks, you're right. I want to ask more about the mechanism. When the amount of data is very small (I have 5,000 pairs of content and headline), the model in this repository still tends to repeat itself. My parameter settings (in params.py) for the coverage mechanism are: `enc_attn_cover = True`, `cover_func = 'max'`, `cover_loss: float = 1`, `show_cover_loss = False`.
I would appreciate it if you could help me!
Sorry, I have run some experiments with this repo but could not reproduce the results on CNN/DM, so there may still be some errors in it. Additionally, the pointer-generator paper uses "sum" as the cover_func.
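If it helps, here is a minimal sketch contrasting the two accumulation rules (the function name `update_coverage` and its arguments are hypothetical, not the repo's actual API): `'sum'` is the running sum from Eq. 10 of the paper, while `'max'` keeps an element-wise maximum instead.

```python
import torch

def update_coverage(coverage, attn, cover_func="sum"):
    """Accumulate the coverage vector from per-step encoder attention.

    coverage: (batch, src_len) coverage accumulated over previous decoder steps
    attn:     (batch, src_len) attention distribution at the current decoder step
    """
    if cover_func == "sum":
        # Eq. 10: c^t = sum of all previous attention distributions a^{t'}
        return coverage + attn
    elif cover_func == "max":
        # Alternative: keep the element-wise maximum attention seen so far
        return torch.max(coverage, attn)
    raise ValueError(f"unknown cover_func: {cover_func}")

# Hypothetical usage inside a decoding loop
batch, src_len = 2, 6
coverage = torch.zeros(batch, src_len)
for step_attn in torch.softmax(torch.randn(3, batch, src_len), dim=-1):
    coverage = update_coverage(coverage, step_attn, cover_func="sum")
```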
Thanks for your help.
The authors of the pointer-generator paper propose a method called the coverage mechanism. The coverage vector is the sum of the attention distributions over all previous decoder timesteps, but the coverage vector in this repository seems to sum the attention of the encoder! Please help me find the correct way to implement the mechanism, or tell me where my mistake is.