Active Learning Tab

This update introduces a separate tab for Active Learning. It contains newly implemented Active Learning algorithms as well as some extensions to the graphical user interface.

Active Learning

So far, the following two Active Learning algorithms are supported, but further ones will be added in the future:

ALRandom: Decides at random whether an instance should be used for training.
ALZliobaite2011: This class contains four active learning strategies for streaming data that explicitly handle concept drift. They are based on randomization, fixed uncertainty, dynamic allocation of labeling efforts over time, and randomization of the search space (Zliobaite et al., 2011). It also contains the Selective Sampling strategy, which is adapted from Cesa-Bianchi et al. (2006) and uses a variable labeling threshold.
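
To make the idea of a labeling strategy more concrete, here is a minimal sketch of a random strategy and a variable-threshold uncertainty strategy. The interface and class names (LabelingStrategySketch, RandomStrategySketch, VariableUncertaintySketch), the step parameter, and the budget check are assumptions made for illustration. The threshold update is a simplified reading of the variable-uncertainty idea and is not taken from the ALRandom or ALZliobaite2011 classes.

```java
import java.util.Random;

// Hypothetical interface for illustration; not part of MOA.
interface LabelingStrategySketch {
    // maxPosterior: the classifier's highest class probability for the instance.
    boolean requestLabel(double maxPosterior);
}

// Random strategy: request each label with probability equal to the budget.
class RandomStrategySketch implements LabelingStrategySketch {
    private final double budget;
    private final Random rng;

    RandomStrategySketch(double budget, long seed) {
        this.budget = budget;
        this.rng = new Random(seed);
    }

    @Override
    public boolean requestLabel(double maxPosterior) {
        return rng.nextDouble() < budget;
    }
}

// Simplified variable-threshold strategy: label uncertain instances and
// adapt the threshold so labeling effort follows the stream over time.
class VariableUncertaintySketch implements LabelingStrategySketch {
    private final double budget;   // maximum fraction of instances to label
    private final double step;     // assumed relative threshold adjustment
    private double threshold = 1.0;
    private long seen = 0;
    private long labeled = 0;

    VariableUncertaintySketch(double budget, double step) {
        this.budget = budget;
        this.step = step;
    }

    @Override
    public boolean requestLabel(double maxPosterior) {
        seen++;
        // Assumed budget check: stop labeling while the budget is exhausted.
        if ((double) labeled / seen >= budget) {
            return false;
        }
        if (maxPosterior < threshold) {
            // Uncertain prediction: ask for the label and tighten the threshold.
            labeled++;
            threshold *= (1.0 - step);
            return true;
        }
        // Confident prediction: widen the threshold so labeling does not stall.
        threshold *= (1.0 + step);
        return false;
    }

    double getThreshold() {
        return threshold;
    }
}
```

In this sketch the threshold shrinks whenever a label is requested and grows otherwise, so the strategy keeps asking for labels even if the classifier becomes confident, which is what allows it to react to concept drift.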

Graphical User Interface Extensions

The tab's graphical interface is based on the Classification tab, but some additional functionality has been added:

Result table

The result preview has been changed from a simple text field showing CSV data to an actual table. (Image: Result Table)

Hierarchy of Tasks

In order to enable convenient and fast evaluation that provides reliable results, we introduced new tasks with a special hierarchy:

ALPrequentialEvaluationTask: Perform prequential evaluation for any chosen active learner.
ALMultiParamTask: Compare different parameter settings for the same algorithm by performing multiple ALPrequentialEvaluationTasks.
ALPartitionEvaluationTask: Split a data stream into several partitions and perform an ALMultiParamTask on each one. This allows for cross-validation-like evaluation.
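
The nesting of the three tasks can be sketched as follows. This is a toy illustration under stated assumptions, not the code of the tasks themselves: the class TaskHierarchySketch, its method names, the majority-class learner, the deterministic budget rule, and the round-robin split are all hypothetical and chosen only to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the task hierarchy; not MOA code.
class TaskHierarchySketch {

    // Outcome of one prequential run for one parameter value.
    static final class RunResult {
        final double budget;
        final double accuracy;
        final double labelRate;

        RunResult(double budget, double accuracy, double labelRate) {
            this.budget = budget;
            this.accuracy = accuracy;
            this.labelRate = labelRate;
        }
    }

    // Prequential (test-then-train) run over binary labels (0 or 1).
    // The toy learner predicts the majority class among acquired labels.
    static RunResult prequential(int[] labels, double budget) {
        int[] counts = new int[2];
        int correct = 0;
        int labeled = 0;
        for (int i = 0; i < labels.length; i++) {
            // Test first: predict before the true label is used.
            int predicted = counts[1] > counts[0] ? 1 : 0;
            if (predicted == labels[i]) {
                correct++;
            }
            // Then train, but only if the (assumed) budget rule allows it.
            if (labeled < budget * (i + 1)) {
                counts[labels[i]]++;
                labeled++;
            }
        }
        return new RunResult(budget,
                (double) correct / labels.length,
                (double) labeled / labels.length);
    }

    // One prequential run per parameter value (here: the budget).
    static List<RunResult> multiParam(int[] labels, double[] budgets) {
        List<RunResult> results = new ArrayList<>();
        for (double b : budgets) {
            results.add(prequential(labels, b));
        }
        return results;
    }

    // Split the stream into k partitions (round-robin is an assumption)
    // and run the multi-parameter comparison on each partition.
    static List<List<RunResult>> partitionEvaluation(int[] labels, int k, double[] budgets) {
        List<List<RunResult>> perPartition = new ArrayList<>();
        for (int p = 0; p < k; p++) {
            int size = (labels.length - p + k - 1) / k;
            int[] part = new int[size];
            for (int j = 0; j < size; j++) {
                part[j] = labels[p + j * k];
            }
            perPartition.add(multiParam(part, budgets));
        }
        return perPartition;
    }
}
```

Calling partitionEvaluation with k partitions and m budget values yields k lists of m results each, which mirrors the tree of tasks described above: one multi-parameter task per partition, and one prequential run per parameter value.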

The tree structure of these tasks and their parameters is now also shown in the task overview panel at the top of the window. (Image: taskoverview)

Evaluation

The task hierarchy requires an extended evaluation scheme. All evaluation graphs use color coding to distinguish the different runs. Each type of task has its own evaluation style:

ALPrequentialEvaluationTask: Results for one single experiment are shown.
ALMultiParamTask: Results for all parameter configurations are shown in one graph.
ALPartitionEvaluationTask: Mean values and standard deviation calculated over all folds are shown for each parameter configuration.
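
As a small worked example of the per-configuration aggregation, the following sketch computes the mean and standard deviation of one measure over the folds. The class name FoldStatisticsSketch is hypothetical, and the use of the sample standard deviation (dividing by n - 1) is an assumption; the evaluation in the tab may use a different estimator.

```java
// Hypothetical helper for illustration; not MOA code.
class FoldStatisticsSketch {

    // Arithmetic mean of one measure (e.g. accuracy) over all folds.
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Sample standard deviation (n - 1 in the denominator, an assumption).
    // Returns 0 when fewer than two folds are available.
    static double sampleStdDev(double[] values) {
        if (values.length < 2) {
            return 0.0;
        }
        double m = mean(values);
        double squaredDiffs = 0.0;
        for (double v : values) {
            squaredDiffs += (v - m) * (v - m);
        }
        return Math.sqrt(squaredDiffs / (values.length - 1));
    }
}
```

For fold accuracies of 0.8, 0.9 and 1.0, this gives a mean of about 0.9 and a sample standard deviation of about 0.1.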

For ALMultiParamTasks and ALPartitionEvaluationTasks, two further types of evaluation are available. First, any selected measure can be inspected in relation to the value of the varied parameter. (Image: partitionevaluation_variedparameter)

Second, any selected measure can be inspected in relation to the true label acquisition rate. This rate, often also called the budget, is very important in active learning applications. (Image: partitionevaluation_labelacqrate)

Waikato / moa

Active Learning Tab #125
