Open dennlinger opened 2 years ago
After applying it to my own dataset (Klexikon) and being surprised that additional samples were being removed, I realized that we really need either an option to do the exact same thing through `Analyzer`, or a feature in `Cleaner` that prints out the full sample at the time of removal. One way could be to actually store the affected samples, or otherwise provide more information.
Currently, the system logs basic stats on the removal, but no information is retained per split, for example. As a result, essentially the entire test set could be removed without this showing up in the logger statements.
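To make the per-split idea concrete, here is a rough sketch of what I have in mind. It is not the existing `Cleaner` code: `filter_splits`, `removal_rates`, and the `should_remove` callback are hypothetical names, and samples are assumed to be plain dicts grouped by split name.

```python
from typing import Callable, Dict, List, Optional, Tuple

Sample = Dict[str, str]


def filter_splits(
    splits: Dict[str, List[Sample]],
    should_remove: Callable[[Sample], Optional[str]],
) -> Tuple[Dict[str, List[Sample]], Dict[str, List[Tuple[Sample, str]]]]:
    """Filter every split, keeping the removed samples and the reason for each.

    `should_remove` (hypothetical) returns a reason string if the sample
    should be dropped, or None if it should be kept.
    """
    kept: Dict[str, List[Sample]] = {}
    removed: Dict[str, List[Tuple[Sample, str]]] = {}
    for split_name, samples in splits.items():
        kept[split_name] = []
        removed[split_name] = []
        for sample in samples:
            reason = should_remove(sample)
            if reason is None:
                kept[split_name].append(sample)
            else:
                # Store the full sample so it can be inspected later.
                removed[split_name].append((sample, reason))
    return kept, removed


def removal_rates(
    splits: Dict[str, List[Sample]],
    removed: Dict[str, List[Tuple[Sample, str]]],
) -> Dict[str, float]:
    """Fraction of samples removed in each split (0.0 for empty splits)."""
    return {
        name: (len(removed[name]) / len(samples) if samples else 0.0)
        for name, samples in splits.items()
    }
```

With something like this, a removal rate close to 1.0 for the test split would be visible immediately, and the stored `(sample, reason)` pairs could be dumped or inspected on demand.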
Also, the current integration is very minimal and prints out everything without the user opting in, which may not be desirable.
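One possible way to make the output opt-in, sketched with the standard `logging` module. The function name `describe_removal`, the logger name, and the `verbose` / `max_chars` parameters are all made up for illustration and do not exist in the library today.

```python
import logging

logger = logging.getLogger("cleaning_sketch")  # hypothetical logger name


def describe_removal(
    text: str, reason: str, verbose: bool = False, max_chars: int = 40
) -> str:
    """Build and log a removal message.

    The full sample text is only included when the caller sets
    `verbose=True`; otherwise long texts are truncated to `max_chars`.
    """
    if verbose or len(text) <= max_chars:
        shown = text
    else:
        shown = text[:max_chars] + "..."
    message = f"Removed sample ({reason}): {shown}"
    # Whether this is displayed at all depends on the user's logging config.
    logger.info(message)
    return message
```

Since the message goes through `logging` rather than `print`, users who do not want any output can simply leave the logger at its default level.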