dennlinger / summaries

A toolkit for summarization analysis and aspect-based summarizers
MIT License

Extend the `Cleaner` removal logging #44

Open dennlinger opened 2 years ago

dennlinger commented 2 years ago

Currently, the system logs basic stats on removed samples, but it retains no information about removals per split, for example. As a result, nearly the entire test set could be removed without this showing up in the logger statements.
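To illustrate what per-split logging could look like, here is a minimal sketch. It is not the existing `Cleaner` code: the function `log_removals_by_split`, the logger name `cleaner_sketch`, and the `warn_ratio` threshold are all hypothetical names chosen for this example.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("cleaner_sketch")


def log_removals_by_split(total_per_split, removed_per_split, warn_ratio=0.5):
    """Log removal counts per split and return the removal ratio of each split.

    total_per_split:   dict mapping split name -> number of samples before cleaning
    removed_per_split: dict mapping split name -> number of removed samples
    warn_ratio:        assumed threshold above which a warning is emitted
    """
    ratios = {}
    for split, total in total_per_split.items():
        removed = removed_per_split.get(split, 0)
        ratio = removed / total if total else 0.0
        ratios[split] = ratio
        # Escalate to WARNING when a large share of one split disappears.
        level = logging.WARNING if ratio >= warn_ratio else logging.INFO
        logger.log(
            level,
            "%s: removed %d of %d samples (%.1f%%)",
            split, removed, total, 100 * ratio,
        )
    return ratios
```

With something like this, a split that loses most of its samples would stand out in the log instead of being hidden in an aggregate count.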

Also, the current integration is very minimal and prints everything without the user opting in, which might not be desirable.
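One possible way to make the output opt-in is to rely on the standard `logging` module rather than printing unconditionally. The sketch below is an assumption about how this could be wired up, not the current behavior; `set_verbosity` and the logger name `cleaner_sketch` are hypothetical.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("cleaner_sketch")
# A NullHandler keeps the logger silent unless the user configures logging.
logger.addHandler(logging.NullHandler())


def set_verbosity(verbose: bool) -> None:
    """Enable INFO-level removal statistics only if the user asks for them."""
    logger.setLevel(logging.INFO if verbose else logging.WARNING)
```

A user who wants the statistics would call `set_verbosity(True)` and attach a handler, for instance via `logging.basicConfig()`.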

dennlinger commented 2 years ago

After applying it to my own dataset (Klexikon) and being surprised that additional samples were being removed, I realized that it is really necessary to either offer a way to run the exact same checks through `Analyzer`, or else add a feature to `Cleaner` that prints out the full sample at the time of removal. One way could be to actually store the affected samples; another would be to provide more information about them in some other form.
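Storing the affected samples could look roughly like the following sketch. It does not use the real `Cleaner` or `Analyzer` interfaces: `RemovalRecord`, `filter_with_records`, the `checks` mapping, and the sample keys `reference` and `summary` are hypothetical and only illustrate the idea of keeping each removed sample together with the reason for its removal.

```python
from dataclasses import dataclass


@dataclass
class RemovalRecord:
    """One removed sample, kept for later inspection (hypothetical structure)."""
    split: str
    index: int
    reason: str
    sample: dict


def filter_with_records(samples, split, checks):
    """Split samples into kept ones and records of removed ones.

    checks: dict mapping a reason name -> predicate that returns True
            if the sample should be removed for that reason.
    """
    kept, removed = [], []
    for index, sample in enumerate(samples):
        # The first check that fires determines the recorded reason.
        reason = next(
            (name for name, is_bad in checks.items() if is_bad(sample)), None
        )
        if reason is None:
            kept.append(sample)
        else:
            removed.append(RemovalRecord(split, index, reason, sample))
    return kept, removed


# Usage with made-up data:
samples = [
    {"reference": "a b c", "summary": "a"},
    {"reference": "x", "summary": ""},
]
checks = {"empty_summary": lambda s: not s["summary"].strip()}
kept, removed = filter_with_records(samples, "test", checks)
```

The returned records could then be printed in full, written to disk, or aggregated by split and reason, depending on how much detail the user wants.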