All methods focus on how to select training examples for the next query round.
All methods assume that we already have an initially labeled training set. (The exception is hierarchical sampling for active learning; the problem with that method is that it entirely gives up the core advantage of active learning.)
Our assumptions:
Class imbalance in the initial training set will hurt active learning performance.
Hierarchical clustering can balance the initial training set.
For new stages, we need to bring in expert knowledge, e.g. run a keyword search through Elasticsearch first to retrieve a more balanced initial training set.
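One way to act on the second assumption above is a sketch like the following: cluster the unlabeled pool hierarchically and take a few points from every cluster as the seed set. The function name `balanced_seed_set` and all parameter choices are our own illustration, not a method from the surveyed papers; the underlying assumption (clusters roughly track classes, so sampling uniformly across clusters yields a more class-balanced seed set than uniform random sampling) is exactly the one stated above and is unverified.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def balanced_seed_set(X, n_clusters=10, per_cluster=2, seed=0):
    """Draw a seed set by taking a few points from every hierarchical cluster.

    Assumption (untested): clusters roughly track classes, so sampling
    uniformly across clusters gives a more class-balanced initial
    training set than uniform random sampling would.
    """
    rng = np.random.default_rng(seed)
    # ward-linkage agglomerative clustering over the unlabeled pool
    labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(X)
    chosen = []
    for c in range(n_clusters):
        members = np.flatnonzero(labels == c)
        take = min(per_cluster, members.size)
        chosen.extend(rng.choice(members, size=take, replace=False).tolist())
    return np.array(sorted(chosen))
```

Even a severely imbalanced pool (e.g. 95 points of one class, 5 of another) would then contribute points from both groups, provided the clustering separates them.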
Negative Results (on the multi-class classification problem)
The entropy-maximization methods make no difference at all compared with random sampling!
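For reference, the selection rule we mean by "entropy maximization" is the standard one: query the unlabeled points whose predicted class distribution has the highest Shannon entropy. The helper name `entropy_query` is ours; a minimal sketch:

```python
import numpy as np

def entropy_query(proba, k):
    """Return indices of the k unlabeled points with highest predictive entropy.

    proba: array of shape (n_samples, n_classes) with predicted class
    probabilities from the current model.
    """
    eps = 1e-12  # guard against log(0)
    ent = -np.sum(proba * np.log(proba + eps), axis=1)
    # highest-entropy (most uncertain) points first
    return np.argsort(ent)[-k:][::-1]

# toy check: the uniform prediction is maximally uncertain
p = np.array([[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]])
print(entropy_query(p, 1))  # -> [1]
```

On our multi-class data, ranking the pool this way selected batches that trained no better than uniformly random batches.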
To Do
Reduce the problem to binary classification, with the minority class as the target.
If that still does not work out, consider keyword search.
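The first to-do item above is a simple relabeling step. A sketch, with our own helper name `to_binary` (the minority class can be found by counting labels):

```python
import numpy as np

def to_binary(y, target_class):
    """Collapse multi-class labels to binary: target (minority) class vs. rest."""
    y = np.asarray(y)
    return (y == target_class).astype(int)

y = np.array([0, 1, 2, 2, 1, 0, 2])
# pick the least frequent class as the target
minority = np.bincount(y).argmin()
print(minority)             # -> 0
print(to_binary(y, minority))  # -> [1 0 0 0 0 1 0]
```

The active learning loop then runs unchanged, but the query strategy only has to separate the minority class from everything else.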
Things done
Tried the entropy-maximization sampling methods on the multi-class problem; as noted under Negative Results, they performed no better than random sampling.