dennlinger / summaries

A toolkit for summarization analysis and aspect-based summarizers
MIT License

Check out single-sentence summaries for `RougeNAligner` #27

Closed dennlinger closed 1 year ago

dennlinger commented 2 years ago

The following snippet produces unexpected results. This should be investigated and, ideally, added as a test case:

from summaries.aligners import RougeNAligner
aligner = RougeNAligner()
aligner.extract_source_sentences("This is a short test. Another sentence is here.", "This is a test.")

Expected output: a single sentence ("This is a short test.")

Actual output: two sentences, the same one twice ("This is a short test.", "This is a short test.")

A potential reason is that the sentences of the "summary" are not split correctly, but I will have to check.
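To illustrate why a mis-split summary would lead to a duplicated sentence, here is a minimal sketch. It is not the `RougeNAligner` implementation: `unigrams` and `align` are hypothetical helpers that use plain unigram overlap as a crude stand-in for ROUGE-1, and the two-fragment split of the summary is an assumed example, not the split actually observed.

```python
import re


def unigrams(text):
    """Lowercased word tokens as a set (crude stand-in for ROUGE-1 matching)."""
    return set(re.findall(r"\w+", text.lower()))


def align(source_sentences, summary_sentences):
    """For each summary sentence, pick the source sentence with the largest overlap."""
    aligned = []
    for summary_sentence in summary_sentences:
        target = unigrams(summary_sentence)
        best = max(source_sentences, key=lambda s: len(unigrams(s) & target))
        aligned.append(best)
    return aligned


source = ["This is a short test.", "Another sentence is here."]

# Summary treated as one sentence: one aligned source sentence.
correct_split = align(source, ["This is a test."])

# Hypothetical mis-split into two fragments: each fragment aligns to the
# same source sentence, so it appears twice in the output.
wrong_split = align(source, ["This is", "a test."])

print(correct_split)
print(wrong_split)
```

Because the output contains one entry per summary sentence, any splitter that breaks a one-sentence summary into several pieces will yield repeated source sentences in an aligner of this shape.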

dennlinger commented 1 year ago

It turns out that this is caused by spaCy incorrectly splitting sentences in the above example. There is not much we can do about it at the moment, other than recommend custom splitters that handle such cases better.
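As a rough idea of what a custom splitter could look like, here is a minimal rule-based sketch using only the standard library. The function name `split_sentences` is hypothetical, and how such a function would be passed to the aligner depends on the library's interface, which is not shown in this thread. The rule is deliberately simple and will mis-handle abbreviations such as "Dr. Smith".

```python
import re


def split_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace and an uppercase letter."""
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text.strip())
    # Drop the empty string that re.split returns for empty input.
    return [part for part in parts if part]


source_sentences = split_sentences("This is a short test. Another sentence is here.")
summary_sentences = split_sentences("This is a test.")

print(source_sentences)
print(summary_sentences)
```

With this rule, the short summary from the snippet above stays a single sentence, which is the behavior the expected output relies on.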