skuschel / generatorpipeline

Parallelize your data-processing pipelines with just a decorator.
GNU General Public License v3.0

Implement a single-pass approximate median accumulator #18

Closed r-radloff closed 1 year ago

r-radloff commented 1 year ago

The ability to calculate the median of a data stream could be useful for analytics such as estimating the detector background. Unlike the mean, the exact median of a dataset cannot be calculated in a single pass without storing the data; however, it is possible to estimate the median (or any other quantile). A good candidate for an efficient single-pass algorithm to approximate the median is the $P^2$ algorithm proposed by Jain and Chlamtac.
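To make the idea concrete, here is a minimal sketch of a $P^2$ estimator in plain Python. It is written from the published description of the algorithm (five markers that are moved with a piecewise-parabolic update, falling back to a linear one), and it is not taken from this repository: the class name `P2Median`, its methods `add` and `value`, and the fallback used for fewer than five samples are all assumptions made for illustration.

```python
class P2Median:
    """Hypothetical single-pass P^2 estimator of the quantile p (default: median)."""

    def __init__(self, p=0.5):
        self.p = p
        self.q = []                    # marker heights (kept sorted)
        self.n = [1, 2, 3, 4, 5]       # actual marker positions
        self.desired = [1, 1 + 2 * p, 1 + 4 * p, 3 + 2 * p, 5]
        self.step = [0, p / 2, p, (1 + p) / 2, 1]  # growth of desired positions

    def _parabolic(self, i, d):
        q, n = self.q, self.n
        return q[i] + d / (n[i + 1] - n[i - 1]) * (
            (n[i] - n[i - 1] + d) * (q[i + 1] - q[i]) / (n[i + 1] - n[i])
            + (n[i + 1] - n[i] - d) * (q[i] - q[i - 1]) / (n[i] - n[i - 1])
        )

    def add(self, x):
        q, n = self.q, self.n
        if len(q) < 5:                 # the first five samples initialise the markers
            q.append(x)
            q.sort()
            return
        # Find the cell k with q[k] <= x < q[k+1], extending the extremes if needed.
        if x < q[0]:
            q[0] = x
            k = 0
        elif x >= q[4]:
            q[4] = x
            k = 3
        else:
            k = max(i for i in range(4) if q[i] <= x)
        for i in range(k + 1, 5):      # markers above the cell shift right by one
            n[i] += 1
        for i in range(5):
            self.desired[i] += self.step[i]
        # Move the three inner markers if they drifted from their desired position.
        for i in (1, 2, 3):
            d = self.desired[i] - n[i]
            if (d >= 1 and n[i + 1] - n[i] > 1) or (d <= -1 and n[i - 1] - n[i] < -1):
                d = 1 if d > 0 else -1
                new = self._parabolic(i, d)
                if not q[i - 1] < new < q[i + 1]:
                    # The parabolic step would break the ordering: use a linear step.
                    new = q[i] + d * (q[i + d] - q[i]) / (n[i + d] - n[i])
                q[i] = new
                n[i] += d

    def value(self):
        if not self.q:
            raise ValueError("no data seen yet")
        if len(self.q) < 5:
            # Crude fallback (an assumption of this sketch): pick from the sorted buffer.
            return self.q[int(self.p * (len(self.q) - 1))]
        return self.q[2]               # the middle marker tracks the quantile


est = P2Median()
for x in [0.3, 0.9, 0.1, 0.5, 0.7, 0.2, 0.8, 0.4, 0.6]:
    est.add(x)
print(est.value())
```

Memory use is constant (five heights and five positions) regardless of the stream length, which is what would make it fit an accumulator-style interface.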

This algorithm has several useful features: