RAISEDAL / RAISEReadingList

This repository contains a reading list of Software Engineering papers and articles!

Paper Review: SciSummPip: An Unsupervised Scientific Paper Summarization Pipeline #52

Open asifsamir opened 1 year ago

asifsamir commented 1 year ago

Publisher

Association for Computational Linguistics

Link to The Paper

https://aclanthology.org/2020.sdp-1.37/

Name of The Authors

Ju J, Liu M, Gao L, et al.

Year of Publication

2020

Summary

This work on graph-based extractive text summarization for scientific documents is motivated by the SummPip paper, although its pipeline differs. The authors also introduce two new steps to control the length of the summary and to remove irrelevant sentences. Unlike SummPip, this work targets single-document summarization.

Graph Creation: Before creating the sentence graph, they ranked sentence pairs using PageRank and stored the scores in a matrix. Low-scoring sentences are deleted from the candidate list. They then created a graph in which each node is a sentence and two nodes are connected iff at least one of four patterns holds: deverbal noun reference, same entity continuation, discourse markers, or sentence similarity (measured with cosine similarity). The similarity is computed from SciBERT embeddings, a BERT model pretrained on scientific text.
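As a rough illustration of the graph-creation step described above, the sketch below builds a sentence graph from cosine similarity alone and ranks the sentences with PageRank. It is not the authors' implementation: the toy vectors stand in for SciBERT embeddings, the threshold and damping values are assumptions, the other three edge patterns are omitted, the order of ranking and graph building is simplified, and all function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_graph(vectors, threshold=0.5):
    """Weighted adjacency matrix: edge i-j iff cosine >= threshold (assumed value)."""
    n = len(vectors)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = cosine(vectors[i], vectors[j])
            if s >= threshold:
                adj[i][j] = adj[j][i] = s
    return adj


def pagerank(adj, damping=0.85, iterations=100):
    """Power-iteration PageRank on a weighted undirected graph."""
    n = len(adj)
    out_weight = [sum(row) for row in adj]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        # Mass held by isolated nodes is spread evenly over all nodes.
        dangling = sum(rank[i] for i in range(n) if out_weight[i] == 0)
        new_rank = []
        for j in range(n):
            incoming = sum(
                rank[i] * adj[i][j] / out_weight[i]
                for i in range(n)
                if out_weight[i] > 0
            )
            new_rank.append((1 - damping) / n + damping * (incoming + dangling / n))
        rank = new_rank
    return rank


def keep_top(scores, keep):
    """Indices of the `keep` highest-scoring sentences, in document order."""
    best = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:keep]
    return sorted(best)


# Toy 3-dimensional "embeddings" (placeholders for SciBERT sentence vectors).
vectors = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.8, 0.3, 0.1],
    [0.0, 0.1, 1.0],
]
adj = similarity_graph(vectors)
scores = pagerank(adj)
candidates = keep_top(scores, keep=3)
```

In this toy input the fourth sentence is dissimilar to the rest, so it ends up isolated in the graph, receives the lowest PageRank score, and is dropped from the candidate list.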

Spectral Clustering: Spectral clustering is applied to the sentence graph. Multi-sentence compression (MSC) is then used to obtain one sentence from each cluster, and these sentences are combined to create the summary.
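To make the clustering step concrete, here is a minimal sketch of spectral clustering into two clusters on a small weighted graph. It assumes the unnormalized graph Laplacian and a sign split on the Fiedler vector, found with plain power iteration. The paper's actual clustering configuration and the MSC step are not reproduced, and the function name is hypothetical.

```python
import math


def spectral_bisect(adj, iterations=500):
    """Split a connected weighted graph into two clusters.

    Uses the sign of the Fiedler vector, the eigenvector of the
    second-smallest eigenvalue of the unnormalized Laplacian L = D - W.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    # Power iteration on M = c*I - L: the largest eigenvectors of M are the
    # smallest eigenvectors of L. c exceeds the largest eigenvalue of L.
    c = 2 * max(degree) + 1.0
    x = [float(i) for i in range(n)]  # deterministic start vector
    for _ in range(iterations):
        # Remove the constant component (the trivial eigenvector of L).
        mean = sum(x) / n
        x = [xi - mean for xi in x]
        # y = (c*I - L) x = (c - d_i) * x_i + sum_j W_ij * x_j
        y = [
            (c - degree[i]) * x[i] + sum(adj[i][j] * x[j] for j in range(n))
            for i in range(n)
        ]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return [0 if v < 0 else 1 for v in x]


# Two tight groups of sentences joined by one weak edge (toy weights).
W = 0.9
adj = [
    [0, W, W, 0, 0, 0],
    [W, 0, W, 0, 0, 0],
    [W, W, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, W, W],
    [0, 0, 0, W, 0, W],
    [0, 0, 0, W, W, 0],
]
labels = spectral_bisect(adj)
```

Sentences 0 to 2 receive one label and sentences 3 to 5 the other. In the pipeline described in this review, each such cluster would then be compressed into a single sentence.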

For comparison, they used different embedding systems: SciBERT, SummPip, and SBERT. In the ROUGE analysis, SciBERT performed better than the others.
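For reference, ROUGE-1 measures unigram overlap between a generated summary and a reference summary. The sketch below is a simplified version, assuming lowercase whitespace tokenization and no stemming, so its numbers will not match the official ROUGE toolkit. The function name and the example sentences are hypothetical.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Return (precision, recall, f1) for clipped unigram overlap."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per token (clipping).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


p, r, f = rouge_1(
    "the model summarizes scientific papers",
    "the model summarizes long scientific papers well",
)
```

Here all five candidate tokens appear in the seven-token reference, so precision is 1.0 and recall is 5/7.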

Contributions of The Paper

Comments

Maximal Marginal Relevance (MMR) performs better than generic approaches for summarization. Need to check. TextRank model (Barrios et al., 2016) with the Okapi BM25 similarity function. Need to check.
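As a reminder of what MMR does, a minimal greedy sketch follows. It picks sentences one at a time, trading relevance against similarity to sentences already chosen. The relevance scores, the similarity matrix, the trade-off value, and the function name are all hypothetical, and this is not taken from the paper.

```python
def mmr_select(relevance, similarity, k, lam=0.7):
    """Greedy Maximal Marginal Relevance selection.

    relevance[i]     : relevance of sentence i to the document or query
    similarity[i][j] : similarity between sentences i and j
    lam              : trade-off between relevance and novelty (assumed value)
    """
    selected = []
    remaining = list(range(len(relevance)))
    while remaining and len(selected) < k:

        def mmr_score(i):
            # Penalize the closest match among sentences already selected.
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Sentences 0 and 1 are both relevant but nearly identical to each other.
relevance = [0.9, 0.85, 0.4]
similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
picked = mmr_select(relevance, similarity, k=2, lam=0.5)
```

With `lam=0.5` the second pick is sentence 2 rather than the near-duplicate sentence 1. With `lam=1.0` the redundancy penalty vanishes and selection falls back to pure relevance.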

Need to recheck this paper. It has an inconsistency (or am I missing something?). It shows abstractive summarization while being unsupervised. How?