ACL-2017-Get To The Point: Summarization with Pointer-Generator Networks

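One-sentence summary:

Seq2seq models for the abstractive text summarization task have two shortcomings: they reproduce factual details inaccurately, and they tend to repeat themselves.

The paper proposes an architecture that augments seq2seq in two orthogonal ways. First, a hybrid pointer-generator network copies words from the source text into the output via pointing, so they are reproduced accurately, while retaining the ability to produce novel words through the generator. Second, it uses coverage to reduce repetition.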

Resources:

pdf
code
[paper-with-code](

Paper information:

Author: Stanford University, Manning
Dataset:
keywords:

Notes:

2.3 Coverage mechanism

The min term in the loss is the part of the coverage mechanism that reduces repetition.
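For reference, the per-timestep loss this sentence refers to, written out as in the paper. Here $w_t^*$ is the target word at step $t$, $a^t$ is the attention distribution, $c^t$ is the coverage vector, and $\lambda$ is a weighting hyperparameter:

$$
\text{loss}_t = -\log P(w_t^*) + \lambda \sum_i \min(a_i^t, c_i^t)
$$

The second term is the coverage loss, $\text{covloss}_t = \sum_i \min(a_i^t, c_i^t)$. Since $\min(a_i^t, c_i^t) \le a_i^t$ and the attention weights sum to 1, it follows that $\text{covloss}_t \le \sum_i a_i^t = 1$.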

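This coverage loss penalizes repeatedly attending to the same locations. The coverage mechanism here differs from the one originally used in NMT. NMT assumes a roughly one-to-one translation ratio, so the final coverage vector is penalized if it is more or less than 1. The coverage loss here is only bounded above (it is at most 1). This is because summarization is more flexible and does not require uniform coverage the way NMT does: only the overlap between each attention distribution and the coverage so far is penalized, which prevents repeated attention.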

The coverage vector used in covloss is also fed to the attention mechanism as an extra input.

(The original attention model)

(The attention mechanism with the coverage vector added)
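For reference, the two scoring functions as given in the paper, where $h_i$ are the encoder hidden states, $s_t$ is the decoder state, and $v$, $W_h$, $W_s$, $b_{\text{attn}}$ are learnable parameters. The original attention:

$$
e_i^t = v^T \tanh(W_h h_i + W_s s_t + b_{\text{attn}}), \qquad a^t = \text{softmax}(e^t)
$$

With the coverage vector $c^t$ added as an extra input:

$$
e_i^t = v^T \tanh(W_h h_i + W_s s_t + w_c c_i^t + b_{\text{attn}})
$$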

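Here w_c is a learnable parameter vector of the same length as v. It is added to the attention mechanism so that the current attention decision takes its previous decisions (summarized in c^t) into account, which keeps the attention mechanism from repeatedly attending to the same locations. (Is this part actually needed for RE?)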
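A minimal pure-Python sketch of how the coverage term could enter the attention score, assuming the scoring form e_i = v^T tanh(W_h h_i + W_s s + w_c c_i + b). The function name `attention_with_coverage`, the tiny dimensions, and the parameter values (in particular the negative `w_c`) are hypothetical and chosen only for illustration; in the real model all of these are learned.

```python
import math

def attention_with_coverage(h, s, c, W_h, W_s, w_c, b, v):
    """Return the attention distribution over source positions.

    h: list of encoder states, s: decoder state,
    c: coverage vector (one scalar per source position),
    W_h, W_s: weight matrices (lists of rows); w_c, b, v: vectors of equal length.
    """
    def matvec(W, x):
        return [sum(w * xj for w, xj in zip(row, x)) for row in W]

    Ws_s = matvec(W_s, s)
    scores = []
    for h_i, c_i in zip(h, c):
        Wh_h = matvec(W_h, h_i)
        # w_c * c_i adds the coverage of position i to every hidden unit
        hidden = [math.tanh(p + q + wc * c_i + bj)
                  for p, q, wc, bj in zip(Wh_h, Ws_s, w_c, b)]
        scores.append(sum(vj * x for vj, x in zip(v, hidden)))

    # numerically stable softmax over source positions
    m = max(scores)
    exps = [math.exp(e - m) for e in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Hypothetical toy parameters: one hidden unit, two identical source states.
params = dict(W_h=[[1.0]], W_s=[[0.0]], w_c=[-2.0], b=[0.0], v=[1.0])
h, s = [[0.5], [0.5]], [0.0]

a_no_cov = attention_with_coverage(h, s, [0.0, 0.0], **params)
a_cov = attention_with_coverage(h, s, [1.0, 0.0], **params)
```

With zero coverage the two identical positions receive equal attention. Once position 0 has accumulated coverage, and under the assumed negative `w_c`, its score drops and attention shifts to position 1.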

So where does c^t come from? It is the sum of the attention distributions over all previous decoder timesteps.
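The accumulation and the resulting coverage loss can be sketched in a few lines of plain Python. This assumes c^0 is the zero vector (nothing has been attended to before the first step) and covloss_t = sum_i min(a_i^t, c_i^t); the function name `coverage_and_loss` and the toy attention values are hypothetical.

```python
def coverage_and_loss(attn):
    """attn: list of attention distributions, one per decoder timestep.

    Returns (coverage vectors c^t, coverage losses covloss_t), where c^t is
    the sum of the attention distributions of all timesteps before t.
    """
    n = len(attn[0])
    coverage = [0.0] * n  # c^0: no source position has been covered yet
    cov_vectors, cov_losses = [], []
    for a_t in attn:
        cov_vectors.append(list(coverage))
        # penalize only the overlap between current attention and past coverage
        cov_losses.append(sum(min(a, c) for a, c in zip(a_t, coverage)))
        # accumulate the current attention into the coverage vector
        coverage = [c + a for c, a in zip(coverage, a_t)]
    return cov_vectors, cov_losses

# Toy example: step 1 repeats the attention of step 0, step 2 mostly moves on.
attn = [
    [0.7, 0.2, 0.1],
    [0.7, 0.2, 0.1],
    [0.0, 0.1, 0.9],
]
cov, loss = coverage_and_loss(attn)
```

In this toy run the first step has zero loss, the exact repeat at the second step hits the upper bound of 1, and the third step is penalized only for the small overlap with positions that were already covered.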

Model figure:

Results:

Papers to read next:
