dennlinger / summaries

A toolkit for summarization analysis and aspect-based summarizers
MIT License

Add end-to-end dataset checker #35

Open dennlinger opened 1 year ago

dennlinger commented 1 year ago

This function would greatly speed up the analysis process by running all available methods with a pre-specified configuration (file?).
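A rough sketch of what such a runner could look like is given below. Everything in it is hypothetical and not part of the current package: the names `CHECKS`, `run_all_checks`, `fraction_empty`, and `fraction_summary_too_long`, the `(reference, summary)` sample layout, and the JSON configuration format are all assumptions made for illustration.

```python
import json
from typing import Callable, Dict, List, Tuple

# Assumed sample layout: (reference text, summary text).
Sample = Tuple[str, str]


def fraction_empty(samples: List[Sample], min_chars: int = 1) -> float:
    """Share of samples whose reference or summary has fewer than min_chars characters."""
    if not samples:
        return 0.0
    bad = sum(
        1 for ref, summ in samples
        if len(ref.strip()) < min_chars or len(summ.strip()) < min_chars
    )
    return bad / len(samples)


def fraction_summary_too_long(samples: List[Sample], max_ratio: float = 1.0) -> float:
    """Share of samples whose summary exceeds max_ratio times the reference length."""
    if not samples:
        return 0.0
    bad = sum(1 for ref, summ in samples if len(summ) > max_ratio * len(ref))
    return bad / len(samples)


# Hypothetical registry: every analysis method is listed once under a short name.
CHECKS: Dict[str, Callable[..., float]] = {
    "empty": fraction_empty,
    "too_long": fraction_summary_too_long,
}


def run_all_checks(samples: List[Sample], config: Dict[str, dict]) -> Dict[str, float]:
    """Run every registered check, passing per-check options from the configuration."""
    results = {}
    for name, check in CHECKS.items():
        options = dict(config.get(name, {}))
        # A check runs unless the configuration explicitly disables it.
        if not options.pop("enabled", True):
            continue
        results[name] = check(samples, **options)
    return results


# The configuration could just as well be read from a file with json.load().
config = json.loads('{"empty": {"min_chars": 1}, "too_long": {"max_ratio": 0.5}}')
samples = [
    ("A long reference text about a topic.", "Short summary."),
    ("Another reference.", ""),
    ("Tiny.", "A summary that is longer than its reference."),
]
report = run_all_checks(samples, config)
print(report)
```

Keeping the checks in a registry means a new analysis method only needs one extra entry, and the configuration file decides which checks run and with which parameters.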

In addition, it might make sense to have a single "data score" that roughly estimates the quality of the dataset. Currently, some methods produce output that is too dense to interpret on the fly.
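One possible way to condense several analysis results into a single number is a weighted average, sketched below. This is only an illustration under assumptions: the function name `data_score`, the idea that each method reports the fraction of problematic samples, and the weighting scheme are hypothetical and not something the toolkit currently defines.

```python
from typing import Dict, Optional


def data_score(
    violation_fractions: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Combine per-check violation fractions (0 = clean, 1 = all bad) into one score.

    The score is the weighted mean of (1 - fraction), so 1.0 means no check
    flagged any sample and 0.0 means every check flagged every sample.
    """
    if not violation_fractions:
        raise ValueError("at least one check result is required")
    if weights is None:
        # Default assumption: all checks matter equally.
        weights = {name: 1.0 for name in violation_fractions}
    total_weight = sum(weights[name] for name in violation_fractions)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(
        weights[name] * (1.0 - fraction)
        for name, fraction in violation_fractions.items()
    )
    return weighted / total_weight


# Made-up example values, not measurements from any real dataset.
fractions = {"empty": 0.1, "too_long": 0.3}
equal = data_score(fractions)
weighted = data_score(fractions, weights={"empty": 3.0, "too_long": 1.0})
print(round(equal, 2), round(weighted, 2))
```

A single number like this hides which check caused a low score, so it would probably work best as a quick summary shown alongside the detailed per-method output rather than replacing it.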

dennlinger commented 1 year ago

#41 introduces Cleaner, a utility that can actually reduce the dataset to a (high?) quality subset.

However, this would still leave room for a single data score, which can be more elaborate than simply filtering on trivial criteria.