SDSTony closed this issue 3 years ago
As you know, CTRLsum is an abstractive summarization model. Output that copies the source is not a big problem for an abstractive model, because extractive summaries are a subset of the summaries an abstractive model can produce. Generating text that appears verbatim in the original article does not make the model an extractive one. However, this may occur more frequently with CTRLsum because its summaries are generally shorter than those of other summarization models.
CTRLsum typically generates shorter output because, for training, only the sentences suited to a given aspect were drawn from the full gold summary, so the model learned to generate sentence by sentence rather than paragraph by paragraph. The paper is worth checking if you want a fuller explanation. It may also be more helpful to open an issue in the original authors' repository, because I did not train this model; it is simply ported from the original repository.
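If it helps to put a number on how much of a summary is copied, here is a minimal sketch, not anything from CTRLsum or this repository. It computes the share of summary n-grams that never occur in the source, plus a check for a verbatim copy. The function names (`novel_ngram_ratio`, `is_verbatim_copy`), the simple regex tokenizer, and the sample sentences are all assumptions made for illustration.

```python
import re


def tokenize(text):
    # Lowercase word tokens; punctuation is dropped (a deliberately simple tokenizer).
    return re.findall(r"\w+", text.lower())


def ngrams(tokens, n):
    # Set of all contiguous n-token tuples in the token list.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def novel_ngram_ratio(source, summary, n=2):
    """Fraction of summary n-grams that never occur in the source.

    0.0 means every n-gram was copied; values near 1.0 suggest heavy rewording.
    """
    summary_ngrams = ngrams(tokenize(summary), n)
    if not summary_ngrams:
        return 0.0
    source_ngrams = ngrams(tokenize(source), n)
    return len(summary_ngrams - source_ngrams) / len(summary_ngrams)


def is_verbatim_copy(source, summary):
    """True if the summary's tokens appear as one contiguous run in the source."""
    src = " ".join(tokenize(source))
    summ = " ".join(tokenize(summary))
    # Pad with spaces so matches respect word boundaries.
    return bool(summ) and f" {summ} " in f" {src} "


if __name__ == "__main__":
    # Hypothetical sample text, not taken from the README or any dataset.
    source = "The cat sat on the mat. It was warm."
    copied = "The cat sat on the mat."
    reworded = "A cat rested on the mat."
    print(novel_ngram_ratio(source, copied), is_verbatim_copy(source, copied))
    print(novel_ngram_ratio(source, reworded), is_verbatim_copy(source, reworded))
```

Running this over a batch of (content, summary) pairs would show whether every output is a pure copy or whether only some are. A ratio of 0.0 for a single sentence only indicates copying for that sample, not that the model is extractive in general.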
As far as I know, CTRLsum is a model that supports abstractive summarization. The README example under "2. Basic Summarization" also shows abstractive summarization, because the subject of the summary sentence has changed compared to the original sentence in the `content`.
Original: He is a ~
Summarization: Tunip is a ~
However, on my custom dataset, every sample I have tried returned the exact original sentence from the `content`, which makes me assume that it is performing extractive summarization. What could be the cause of this?
Thank you.