webis-de / small-text

Active Learning for Text Classification in Python
https://small-text.readthedocs.io/
MIT License

Throwing away stale data #29

Closed vahuja4 closed 1 year ago

vahuja4 commented 1 year ago

@chschroeder - Active learning (AL) deals primarily with selecting data points to add to the training set. Can it also be used to select data points that should be purged from the training set because they are no longer useful?

chschroeder commented 1 year ago

I haven't come across that use of AL yet, but if you want to downsize your dataset, there are approaches such as core sets or submodular optimization.
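
For readers unfamiliar with core sets, here is a minimal sketch of one common variant, greedy k-center (farthest-first) selection, which keeps a small subset of points that covers the feature space and treats the rest as candidates for removal. It is plain Python and does not use the small-text API; the function name `k_center_greedy`, the `first` parameter, and the toy `embeddings` are made up for illustration. It also assumes that "no longer useful" can be approximated by redundancy in embedding space, which may not hold for data that is stale for other reasons.

```python
import math
from typing import List, Sequence


def k_center_greedy(points: Sequence[Sequence[float]], k: int, first: int = 0) -> List[int]:
    """Return indices of k points chosen by greedy k-center (farthest-first) selection.

    Hypothetical helper for illustration; assumes the points are distinct.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    selected = [first]
    # Distance from every point to its nearest selected center so far.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(selected) < k:
        # Pick the point that is farthest from all current centers.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        # Update nearest-center distances with the newly added center.
        min_dist = [min(d, math.dist(p, points[nxt])) for d, p in zip(min_dist, points)]
    return selected


# Toy 2-D "embeddings": two tight clusters and one outlier.
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
keep = k_center_greedy(embeddings, k=3)
drop = [i for i in range(len(embeddings)) if i not in keep]
print(keep, drop)
```

In this toy example the selection keeps one representative per region, and the near-duplicates inside each cluster end up in `drop`. In practice the points would be sentence or document embeddings, and labels would likely need to be taken into account as well.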