Low percentile values do not seem to be calculated correctly

Here are the percentile calculated using this library (first column) and percentiles calculated using github.com/montanaflynn/stats library (second column)

Target percentiles:

calcQuantiles = []float64{0.001, 0.01, 0.03, 0.05, 0.1, 0.2, 0.3, 0.4, 0.50, 0.90}

time="2019-05-24T08:53:00Z" level=info msg="Total: 62748 samples"
time="2019-05-24T08:53:00Z" level=info msg="0.1th: 182, 187"
time="2019-05-24T08:53:00Z" level=info msg="1.0th: 4494, 2826"
time="2019-05-24T08:53:00Z" level=info msg="3.0th: 7982, 8024"
time="2019-05-24T08:53:00Z" level=info msg="5.0th: 11528, 11689"
time="2019-05-24T08:53:00Z" level=info msg="10.0th: 15766, 15519"
time="2019-05-24T08:53:00Z" level=info msg="20.0th: 19606, 19680"
time="2019-05-24T08:53:00Z" level=info msg="30.0th: 20909, 20915"
time="2019-05-24T08:53:00Z" level=info msg="40.0th: 21205, 21206"
time="2019-05-24T08:53:00Z" level=info msg="50.0th: 21347, 21350"
time="2019-05-24T08:53:00Z" level=info msg="90.0th: 21470, 21470"

time="2019-05-24T08:54:00Z" level=info msg="Total: 64905 samples"
time="2019-05-24T08:54:00Z" level=info msg="0.1th: 318, 187"
time="2019-05-24T08:54:00Z" level=info msg="1.0th: 4106, 2838"
time="2019-05-24T08:54:00Z" level=info msg="3.0th: 9910, 8039"
time="2019-05-24T08:54:00Z" level=info msg="5.0th: 11709, 11720"
time="2019-05-24T08:54:00Z" level=info msg="10.0th: 15785, 15536"
time="2019-05-24T08:54:00Z" level=info msg="20.0th: 19635, 19685"
time="2019-05-24T08:54:00Z" level=info msg="30.0th: 20962, 20917"
time="2019-05-24T08:54:00Z" level=info msg="40.0th: 21221, 21207"
time="2019-05-24T08:54:00Z" level=info msg="50.0th: 21351, 21351"
time="2019-05-24T08:54:00Z" level=info msg="90.0th: 21466, 21471"

time="2019-05-24T08:54:52Z" level=info msg="Total: 66775 samples"
time="2019-05-24T08:54:52Z" level=info msg="0.1th: 823, 187"
time="2019-05-24T08:54:52Z" level=info msg="1.0th: 3790, 2844"
time="2019-05-24T08:54:52Z" level=info msg="3.0th: 9398, 8046"
time="2019-05-24T08:54:52Z" level=info msg="5.0th: 11577, 11730"
time="2019-05-24T08:54:52Z" level=info msg="10.0th: 15679, 15538"
time="2019-05-24T08:54:52Z" level=info msg="20.0th: 19635, 19683"
time="2019-05-24T08:54:52Z" level=info msg="30.0th: 20907, 20917"
time="2019-05-24T08:54:52Z" level=info msg="40.0th: 21201, 21208"
time="2019-05-24T08:54:52Z" level=info msg="50.0th: 21346, 21352"
time="2019-05-24T08:54:52Z" level=info msg="90.0th: 21466, 21471"

Method used to log the above

    log.Printf("Total: %d samples", quantiles.Count())
    for _, quantile := range calcQuantiles {
        percentile, err := stats.Percentile(samples, quantile*100)
        if err != nil {
            log.Errorf("failed to calc percentile - %s", err)
        } else {
            log.Infof("%.1fth: %.0f", quantile*100, quantiles.Query(quantile), percentile)
        }
    }
   }

Please note a big difference in low percentile numbers between the 3 print outs above. Even though one would expect some differences at say 0.1th percentile because 3 print outs above contained different sample numbers. Still there is unexpectedly large variability with only a limited sample number change. Also notice that percentile calculated with github.com/montanaflynn/stats does not display the same variability, which rules out the possibility that the samples added were somehow very different to the previous lot.

So overall I think this library calculates low percentiles incorrectly or the method is not really suitable for this.

I have the data that was used for calculating the above and happy to share it. Just let me know if you want it. (it's less than 1Mb file)

bmizerany / perks

Low percentile values do not seem to be calculated correctly #20