Waikato / moa

MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection and recommender systems) and tools for evaluation.
http://moa.cms.waikato.ac.nz/
GNU General Public License v3.0

Active Learning Tab #125

Closed corneliusboehm closed 6 years ago

corneliusboehm commented 6 years ago

This update introduces a separate tab for Active Learning. It contains newly implemented Active Learning algorithms as well as some extensions to the graphical user interface.

Active Learning

So far, the following two Active Learning algorithms are supported, but further ones will be added in the future:

Graphical User Interface Extensions

The tab's graphical interface is based on the Classification tab, but some additional functionality has been added:

Result table

The result preview has been updated from a simple text field showing CSV data to an actual table (screenshot: Result Table).

Hierarchy of Tasks

In order to enable convenient and fast evaluation that provides reliable results, we introduced new tasks with a special hierarchy:

  1. ALPrequentialEvaluationTask: Perform prequential evaluation for any chosen active learner.
  2. ALMultiParamTask: Compare different parameter settings for the same algorithm by performing multiple ALPrequentialEvaluationTasks.
  3. ALPartitionEvaluationTask: Split a data stream into several partitions and perform an ALMultiParamTask on each one. This allows for cross-validation-like evaluation.
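The nesting of these three tasks can be sketched in a few lines. The following is a minimal, language-agnostic illustration in Python and does not use MOA's API: `RandomBudgetLearner`, `prequential_evaluation`, `multi_param` and `partition_evaluation` are hypothetical names, the learner is a toy majority-class predictor, and the round-robin split is an assumption, since the description does not say how the stream is partitioned.

```python
import random
from collections import Counter


class RandomBudgetLearner:
    """Hypothetical toy active learner (not a MOA class): predicts the
    majority class and requests a label with probability `budget`."""

    def __init__(self, budget, seed=0):
        self.budget = budget
        self.rng = random.Random(seed)
        self.counts = Counter()

    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None

    def wants_label(self, x):
        return self.rng.random() < self.budget

    def train(self, x, y):
        self.counts[y] += 1


def prequential_evaluation(learner, stream):
    """Level 1: test-then-train. Each instance is predicted first; the
    learner then decides whether to acquire its true label."""
    correct = acquired = total = 0
    for x, y in stream:
        total += 1
        correct += learner.predict(x) == y
        if learner.wants_label(x):
            learner.train(x, y)
            acquired += 1
    total = max(total, 1)  # avoid division by zero on an empty stream
    return {"accuracy": correct / total, "label_acq_rate": acquired / total}


def multi_param(make_learner, param_values, stream):
    """Level 2: one prequential run per value of the varied parameter."""
    return {p: prequential_evaluation(make_learner(p), stream)
            for p in param_values}


def partition_evaluation(make_learner, param_values, stream, n_partitions):
    """Level 3: one multi-parameter run per partition.
    Assumption: partitions are formed round-robin."""
    partitions = [stream[i::n_partitions] for i in range(n_partitions)]
    return [multi_param(make_learner, param_values, part)
            for part in partitions]


# Synthetic stream of (features, label) pairs, for illustration only.
stream = [((i,), i % 4 == 0) for i in range(300)]
results = partition_evaluation(
    lambda budget: RandomBudgetLearner(budget, seed=1), [0.1, 0.5], stream, 3)
```

Here `results` holds one dictionary per partition, keyed by the varied parameter value, which mirrors the tree shown in the task overview panel: partition task, then multi-parameter task, then individual prequential runs.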

The tree structure of these tasks and their parameters is now also shown in the task overview panel at the top of the window (screenshot: taskoverview).

Evaluation

The introduced task hierarchy requires an extended evaluation scheme. In all evaluation graphs, color coding is used to distinguish the different runs. Each type of task has its own evaluation style:

For ALMultiParamTasks and ALPartitionEvaluationTasks, two further types of evaluation are available. Any selected measure can be inspected in relation to the value of the varied parameter (screenshot: partitionevaluation_variedparameter).

The same can be done with regard to the true label acquisition rate, a measure that is often also called the budget and is very important in active learning applications (screenshot: partitionevaluation_labelacqrate).
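To make the two views concrete, the sketch below shows one plausible way to turn per-partition results into plot points. It is an illustration under stated assumptions, not the tab's actual implementation: `summarize` is a hypothetical helper, aggregating by mean and standard deviation across partitions is an assumption, and the numbers in `partition_results` are made up purely to show the data shape.

```python
from statistics import mean, stdev


def summarize(partition_results, measure, x_axis=None):
    """Return sorted (x, mean, std) points for `measure`.

    x is the varied parameter value by default. If `x_axis` names another
    measure (for example the label acquisition rate), x is that measure's
    mean across partitions instead.
    """
    points = []
    for param in partition_results[0]:
        values = [part[param][measure] for part in partition_results]
        if x_axis is None:
            x = param
        else:
            x = mean(part[param][x_axis] for part in partition_results)
        spread = stdev(values) if len(values) > 1 else 0.0
        points.append((x, mean(values), spread))
    return sorted(points)


# Made-up illustrative values: one dict per partition, keyed by the
# varied parameter value (here a budget setting).
partition_results = [
    {0.1: {"accuracy": 0.70, "label_acq_rate": 0.09},
     0.5: {"accuracy": 0.80, "label_acq_rate": 0.52}},
    {0.1: {"accuracy": 0.74, "label_acq_rate": 0.11},
     0.5: {"accuracy": 0.84, "label_acq_rate": 0.48}},
]

by_parameter = summarize(partition_results, "accuracy")
by_budget = summarize(partition_results, "accuracy", x_axis="label_acq_rate")
```

The two calls differ only in the x-axis: `by_parameter` plots the measure against the configured parameter value, while `by_budget` plots it against the label acquisition rate that was actually observed, which can deviate from the configured budget.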