boostorg / accumulators

An awesome library from Boost
http://boost.org/libs/accumulators
22 stars 54 forks source link

Quadratic quantile interpolator does not guarantee monotonicity #62

Open adimajo opened 5 months ago

adimajo commented 5 months ago

In [weighted_][extended_]p_square.hpp, the p-square algorithm (an online - in the sense that it doesn't require storing all samples - quantile estimator) is implemented. Additionally:

The extended version also introduces the ability to use interpolation:

Unfortunately, the computation of the quadratic interpolator polynomial does not incorporate a mechanism to enforce monotonicity such that it is possible/likely to get a mathematically incoherent quantile function, i.e. a an estimated quantile function such that $\exists \; 0 < i < j < 1 \; | \; \hat{q}(i) > \hat{q}(j)$.

To illustrate this claim, the programs MWE3(_4).{cpp,py} do the following:

Schematically, for the final values, in the quantile function space, this gives:

xychart-beta
    title "Estimated quantile function"
    x-axis [0.99, 0.992, 0.993, 0.994, 0.995, 0.996, 0.997, 0.999]
    y-axis "Quantile" 1.9 --> 3
    line [2.32, 2.03, 1.98, 2.00, 2.08, 2.22, 2.43, 3.04]

This phenomenon appears because the last point participating in the quadratic interpolation polynomial, namely 0.9999 is close to 0.999 but the quantiles are far apart, such that the "convexity" required to go through all three points is high.

Since this makes no mathematical sense, I would either issue a strong warning at instantiation or deprecate this interpolator. If an additional interpolator (w.r.t. the linear one) is needed, I would suggest looking into integrating splines which monotonicity can be enforced (see e.g. https://github.com/ttk592/spline/).

Notes: