Waikato / moa

MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection and recommender systems) and tools for evaluation.
http://moa.cms.waikato.ac.nz/
GNU General Public License v3.0

Incremental Variance #150

Closed: blablahaha closed this issue 5 years ago

blablahaha commented 5 years ago

I'm confused by the incremental variance computation in ADWIN (the concept drift detection algorithm):

https://github.com/Waikato/moa/blob/79648958a5627bf6b4e5e2345b8a020a4b38d018/moa/src/main/java/moa/classifiers/core/driftdetection/ADWIN.java#L390

I found that it differs from this Wikipedia formula.

Could you help me with this? Thanks

abifet commented 5 years ago

ADWIN computes variance using exponential histograms. It is explained in this paper:

http://ilpubs.stanford.edu:8090/749/

Babcock, Brian and Datar, Mayur and Motwani, Rajeev and O'Callaghan, Liadan (2003) Maintaining Variance and k-Medians over Data Stream Windows. In: ACM Symposium on Principles of Database Systems (PODS 2003), June 9-12, 2003, San Diego, California.
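
To make the idea concrete, here is a minimal Java sketch of how a variance can be maintained from per-bucket summaries. It uses the standard pairwise rule for combining sums of squared deviations, `M2 = M2_a + M2_b + n_a * n_b * (mean_a - mean_b)^2 / (n_a + n_b)`. The names `VarianceMergeSketch`, `Summary` and `merge` are hypothetical and are not MOA's API, and the assumption that ADWIN's buckets are combined with exactly this rule comes from reading the formula, not from anything stated in this thread. Note that the quantity tracked is the sum of squared deviations, not the variance itself, which may explain why it looks different from a textbook running-variance formula.

```java
// Sketch only: hypothetical names, not taken from MOA.
final class VarianceMergeSketch {

    /** Summary of a block of values: count, sum, and sum of squared deviations from the block mean. */
    static final class Summary {
        final long n;
        final double sum;
        final double m2;

        Summary(long n, double sum, double m2) {
            this.n = n;
            this.sum = sum;
            this.m2 = m2;
        }

        /** A block holding a single value has zero squared deviation. */
        static Summary of(double x) {
            return new Summary(1, x, 0.0);
        }

        double mean() {
            return sum / n;
        }

        /** Population variance: squared deviations divided by the count. */
        double variance() {
            return m2 / n;
        }
    }

    /** Combines two block summaries without revisiting the individual values. */
    static Summary merge(Summary a, Summary b) {
        if (a.n == 0) return b;
        if (b.n == 0) return a;
        double d = a.mean() - b.mean();
        // Extra spread caused by the two block means being different.
        double inc = (double) a.n * b.n * d * d / (a.n + b.n);
        return new Summary(a.n + b.n, a.sum + b.sum, a.m2 + b.m2 + inc);
    }

    public static void main(String[] args) {
        double[] xs = {2, 4, 4, 4, 5, 5, 7, 9};
        Summary s = new Summary(0, 0.0, 0.0);
        for (double x : xs) {
            // Adding one value is a merge with a single-element block.
            s = merge(s, Summary.of(x));
        }
        System.out.println("mean = " + s.mean() + ", variance = " + s.variance());
    }
}
```

When `b` is a single new value `x`, the increment reduces to `n * (x - mean)^2 / (n + 1)`, where `n` and `mean` describe the values seen so far. This is algebraically the same update as the usual one-pass (Welford) recurrence, just written in terms of the old mean only.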