dennlinger / summaries

A toolkit for summarization analysis and aspect-based summarizers
MIT License

Unify `Analyzer` for sample vs. dataset-wide usage. #37

Open dennlinger opened 1 year ago

dennlinger commented 1 year ago

Currently, some `Analyzer` functions operate on a single sample, whereas others process an entire input dataset at once.

It would make sense to restructure `Analyzer` around a more streamlined interface, or otherwise to separate concerns more strictly (a dedicated `Deduplicator` class might be helpful).
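
To make the proposed split more concrete, here is a minimal sketch of one possible separation. It is not the library's current API: `Sample`, `SampleAnalyzer`, `DatasetAnalyzer`, the `compression_ratio` metric, and the exact-match logic in `Deduplicator` are all hypothetical names and behaviors chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Sample:
    """Hypothetical container for one (reference, summary) pair."""
    reference: str
    summary: str


class SampleAnalyzer:
    """Hypothetical: holds only functions that look at a single sample."""

    def compression_ratio(self, sample: Sample) -> float:
        # Whitespace-token length ratio; a stand-in for any per-sample metric.
        ref_len = len(sample.reference.split())
        summ_len = len(sample.summary.split())
        return summ_len / ref_len if ref_len else 0.0


class Deduplicator:
    """Hypothetical: dataset-wide logic, kept apart from per-sample checks."""

    def deduplicate(self, samples: Iterable[Sample]) -> List[Sample]:
        # Exact-match deduplication that keeps the first occurrence.
        seen = set()
        unique = []
        for sample in samples:
            key = (sample.reference, sample.summary)
            if key not in seen:
                seen.add(key)
                unique.append(sample)
        return unique


class DatasetAnalyzer:
    """Hypothetical: applies the per-sample analyzer across a dataset."""

    def __init__(self, sample_analyzer: SampleAnalyzer) -> None:
        self.sample_analyzer = sample_analyzer

    def compression_ratios(self, samples: Iterable[Sample]) -> List[float]:
        return [self.sample_analyzer.compression_ratio(s) for s in samples]
```

With this layout, deduplication becomes an optional step that callers can skip, and the dataset-level class only delegates to the sample-level one instead of mixing both granularities in a single class.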

Also, the current tool focuses entirely on single-document summarization. Deduplication, for example, might be unnecessary (or at least less important) in a multi-document summarization (MDS) setting, where we may want to keep duplicates because the exact content can still vary.