This code accompanies the AAAI 2021 paper *Contextualized Rewriting for Text Summarization*.
Python version: 3.6
Package requirements: `torch==1.1.0`, `pytorch_transformers`, `tensorboardX`, `multiprocess`, `pyrouge`
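Assuming a pip-based environment, the requirements above can be installed in one step (only `torch` is version-pinned by the repo; the remaining versions are left to pip, so treat this as a sketch):

```shell
pip install torch==1.1.0 pytorch_transformers tensorboardX multiprocess pyrouge
```

Note that `pyrouge` additionally expects a local ROUGE-1.5.5 installation to be configured before evaluation.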
Some code is borrowed from ONMT and PreSumm.
Results of the contextualized rewriter applied to various extractive summarizers on CNN/DailyMail (30/9/2020):
| Models | ROUGE-1 | ROUGE-2 | ROUGE-L | Words |
|---|---|---|---|---|
| Oracle of BERT-Ext | 46.77 | 26.78 | 43.32 | 112 |
| + ContextRewriter | 52.57 (+5.80) | 29.71 (+2.93) | 49.69 (+6.37) | 63 |
| LEAD-3 | 40.34 | 17.70 | 36.57 | 85 |
| + ContextRewriter | 41.09 (+0.75) | 18.19 (+0.49) | 38.06 (+1.49) | 55 |
| BERTSUMEXT w/o Tri-Bloc | 42.50 | 19.88 | 38.91 | 80 |
| + ContextRewriter | 43.31 (+0.81) | 20.44 (+0.56) | 40.33 (+1.42) | 54 |
| BERT-Ext (ours) | 41.04 | 19.56 | 37.66 | 105 |
| + ContextRewriter | 43.52 (+2.48) | 20.57 (+1.01) | 40.56 (+2.90) | 66 |
The contextualized rewriter can be evaluated with the following experimental script, which includes the LEAD-3, BERTSUMEXT, and BERT-Ext extractive summarizers. All parameters and settings are hard-coded in the script.
```
python src/exp_varext_guidabs.py
```
The rewriter can also be easily applied to other extractive summarizers using the following code. The full example can be found in `context_rewriter.py`.
```python
rewriter = ContextRewriter(args.model_file)
doc_lines = ["georgia high school ...", "less than 24 hours ...", ...]
ext_lines = ["georgia high school ...", "less than 24 hours ..."]
res_lines = rewriter.rewrite(doc_lines, ext_lines)
```
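For illustration, the LEAD-3 baseline from the table above simply extracts the first sentences of the document, so plugging it into the rewriter only takes a few lines. This is a sketch; the `lead3` helper name is ours and not part of the repo:

```python
def lead3(doc_lines, k=3):
    """Return the first k sentences of the document (the LEAD-3 baseline for k=3)."""
    return doc_lines[:k]

# The extracted sentences would then be rewritten in document context, e.g.:
# res_lines = rewriter.rewrite(doc_lines, lead3(doc_lines))
```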
The contextualized rewriter can be trained with the following script. All settings are packed into the .py file.
```
python src/exp_guidabs.py
```
By default, the input data path is `./bert_data`, and the output model path is `./exp_guidabs`.
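Before training, both default directories can be created up front (a minimal setup sketch; the preprocessed training data itself must be placed in `bert_data` separately):

```shell
mkdir -p ./bert_data ./exp_guidabs
```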