Generating Wikipedia By Summarizing Long Sequences #5

Open flrngel opened 6 years ago

https://arxiv.org/abs/1801.10198 (published as a conference paper at ICLR 2018)


1. Introduction

2. Related Work

2.1. Other datasets used in neural abstractive summarization

2.2. Tasks involving Wikipedia

3. English Wikipedia as a multi-document summarization dataset

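Data augmentation from this paper (web search results as extra source documents)

  1. Search Google using the section titles as queries
  2. Collect 10 result pages per query, excluding the Wikipedia page of the article itself
  3. Remove clones
  4. The remaining search results are S_i, a subset of the source documents D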
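
The filtering steps above can be sketched roughly as follows. This is a minimal illustration, not the paper's pipeline: the function names, the `(url, text)` input shape, the unigram-recall clone test, and the `0.5` threshold are all assumptions made for the example.

```python
import re

def unigrams(text):
    """Set of lowercased word tokens in a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def unigram_recall(candidate, article):
    """Fraction of the article's unigrams that also appear in the candidate."""
    article_words = unigrams(article)
    if not article_words:
        return 0.0
    return len(article_words & unigrams(candidate)) / len(article_words)

def refine_search_results(results, article_url, article_text,
                          max_recall=0.5, top_k=10):
    """Build S_i from ranked search results.

    results: list of (url, text) pairs in search-rank order (hypothetical shape).
    max_recall: assumed clone threshold; pages covering more of the article
    than this are treated as clones and dropped.
    """
    kept = []
    for url, text in results[:top_k]:
        if url == article_url:
            continue  # step 2: skip the Wikipedia page itself
        if unigram_recall(text, article_text) > max_recall:
            continue  # step 3: skip clones of the article
        kept.append((url, text))
    return kept
```

For example, a mirror site that copies the article verbatim has a recall of 1.0 and is dropped, while an unrelated page is kept.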

4. Methods and models

  1. Select a subset of the input (extractive stage)
  2. Train an abstractive model on that subset (abstractive stage)

4.1. Extractive stage

tf-idf performed best among the extractive methods (see Table 3).
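
A minimal sketch of tf-idf ranking for the extractive stage, assuming the article title is used as the query and each paragraph is treated as a document. The tokenizer and the function name `rank_paragraphs` are illustrative choices, and the exact weighting in the paper may differ.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())

def rank_paragraphs(title, paragraphs):
    """Return paragraph indices sorted by descending tf-idf score.

    Score of a paragraph = sum over query words of
    (count of word in paragraph) * log(num_paragraphs / paragraphs_containing_word).
    """
    docs = [Counter(tokenize(p)) for p in paragraphs]
    n_docs = len(docs)
    query_words = set(tokenize(title))
    scores = []
    for counts in docs:
        score = 0.0
        for word in query_words:
            doc_freq = sum(1 for d in docs if word in d)
            if doc_freq:
                score += counts[word] * math.log(n_docs / doc_freq)
        scores.append(score)
    return sorted(range(n_docs), key=lambda i: scores[i], reverse=True)
```

The top-ranked paragraphs would then be concatenated, up to a token budget, to form the input for the abstractive model.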

4.2. Abstractive stage

4.2.1. Data representation

4.2.2. Baseline models

T-ED stands for the typical Transformer encoder-decoder.

4.2.3. Transformer Decoder (T-D)

T-D is a better-performing baseline model; just remember the formula below. (formula images not preserved)
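
Since the formula itself did not survive in these notes, here is the T-D formulation as best I recall it from Section 4.2.3 of the paper; treat it as a reconstruction and check it against the original. The input tokens m and output tokens y are joined into one sequence w with a separator token δ, and the model is trained as an ordinary language model on that sequence:

```latex
(w^1, \ldots, w^{n+\eta+1}) = (m^1, \ldots, m^n, \delta, y^1, \ldots, y^\eta)

p(w^1, \ldots, w^{n+\eta}) = \prod_{j=1}^{n+\eta} p(w^j \mid w^1, \ldots, w^{j-1})
```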

4.2.4. Transformer decoder with memory-compressed attention (T-DMCA)

T-DMCA is the final model proposed in this paper.

Local attention

Memory-compressed attention

They use an LMLML layer architecture (L for a local attention layer, M for a memory-compressed attention layer). (image not preserved)
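
A toy sketch of the two attention variants and the LMLML stacking, in plain Python. Several simplifications are assumptions of this example rather than details from the paper: queries, keys, and values are the raw inputs (no learned projections, no multiple heads), causal masking is omitted, and memory compression is done with mean pooling as a stand-in for a learned compression such as a strided convolution. The block size and stride values are arbitrary.

```python
import math

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    logits = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - peak) for l in logits]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(dim)]

def local_attention(xs, block_size):
    """Each position attends only to positions inside its own block."""
    out = []
    for start in range(0, len(xs), block_size):
        block = xs[start:start + block_size]
        out.extend(attend(x, block, block) for x in block)
    return out

def compress(vectors, stride):
    """Mean-pool non-overlapping windows of `stride` vectors (assumed stand-in)."""
    pooled = []
    for start in range(0, len(vectors), stride):
        window = vectors[start:start + stride]
        pooled.append([sum(col) / len(window) for col in zip(*window)])
    return pooled

def memory_compressed_attention(xs, stride):
    """Every position attends to a shorter, compressed set of keys and values."""
    memory = compress(xs, stride)
    return [attend(x, memory, memory) for x in xs]

def run_stack(xs, pattern="LMLML", block_size=4, stride=3):
    """Apply attention layers in the order given by `pattern`."""
    for kind in pattern:
        if kind == "L":
            xs = local_attention(xs, block_size)
        else:
            xs = memory_compressed_attention(xs, stride)
    return xs
```

The point of the design is cost: local attention limits each position to a fixed-size block, and memory compression shrinks the number of keys and values by the stride factor, so both layer types are cheaper than full attention over a long input while the M layers still see the whole sequence.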

5. Experiments

5.1. Evaluation