dennlinger / summaries

A toolkit for summarization analysis and aspect-based summarizers
MIT License

Add processing of pre-split sentence for Rouge2Aligner (or aligner in general) #17

Closed dennlinger closed 2 years ago

dennlinger commented 2 years ago

Given that our Klexikon corpus comes with pre-split sentences, and its pre-processing is more extensive than what our general library should (or can) do, it makes sense to differentiate between raw-text inputs and inputs that are already split at the sentence level.

One caveat is that we will have to design this so that multi-document summarization inputs can still be distinguished from pre-split single-document inputs.
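To make the distinction concrete, here is a minimal sketch of how the three input shapes could be told apart. It is not the library's actual API: the function name `classify_input`, the returned labels, and the assumption that a multi-document input arrives as a list of sentence lists are all hypothetical and only serve as illustration.

```python
from typing import List, Union

# Hypothetical input shapes (assumptions, not the library's real types):
#   str              -> raw text that still needs sentence splitting
#   List[str]        -> one document, already split into sentences
#   List[List[str]]  -> several documents, each already split into sentences
AlignerInput = Union[str, List[str], List[List[str]]]


def classify_input(data: AlignerInput) -> str:
    """Return a label describing which of the three shapes `data` has."""
    if isinstance(data, str):
        return "raw_text"

    if isinstance(data, list):
        # A flat list of strings is treated as pre-split sentences.
        # Note: an empty list also lands here, since all() of nothing is True.
        if all(isinstance(sentence, str) for sentence in data):
            return "pre_split"

        # A list whose elements are all lists of strings is treated
        # as a multi-document input.
        if all(
            isinstance(doc, list) and all(isinstance(s, str) for s in doc)
            for doc in data
        ):
            return "multi_document"

    raise TypeError("Expected str, List[str], or List[List[str]].")
```

With this layout, a pre-split single document and a multi-document input differ only in nesting depth, so mixed inputs such as `["a", ["b"]]` are rejected rather than guessed at.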

dennlinger commented 2 years ago

Resolved in 9b18acb. Unfortunately, processing is now around 3x slower, because each individual sentence still has to be processed with spaCy for lemmatization.