bmizerany / perks

Effective Computation of Things
BSD 2-Clause "Simplified" License
186 stars 53 forks source link

out-of-order results querying a stream with few samples #5

Closed kr closed 11 years ago

kr commented 11 years ago

We saw strange results with code like the following:

q := quantile.NewTargeted(.50, .95, .99)
q.Insert(x)
// several more Insert calls, only a couple hundred
p50 := q.Query(.50)
p95 := q.Query(.95)
p99 := q.Query(.99)

The values for p50 were sometimes greater than p95 or p99.

reading the source at quantile/stream.go#L111 we see the flat slice of samples being queried directly. But in the "fast path" the slice isn't sorted, though it does get sorted in flush.

msakrejda commented 11 years ago

Reversing the order of inserts in the TestUncompressed test illustrates the issue:

func TestUncompressedUnsorted(t *testing.T) {
       tests := []float64{0.50, 0.90, 0.95, 0.99}
       q := NewTargeted(tests...)
       for i := 100; i > 0; i-- {
               q.Insert(float64(i))
       }
       if g := q.Count(); g != 100 {
               t.Errorf("want count 100, got %d", g)
       }
       // Before compression, Query should have 100% accuracy.
       for _, v := range tests {
               w := v * 100
               if g := q.Query(v); g != w {
                       t.Errorf("want %f, got %f", w, g)
               }
       }
}

gives the output:

maciek@gamera:~/go/src/github.com/bmizerany/perks/quantile$ go test ./...
--- FAIL: TestUncompressedUnsorted (0.00 seconds)
        stream_test.go:114: want 50.000000, got 51.000000
        stream_test.go:114: want 90.000000, got 11.000000
        stream_test.go:114: want 95.000000, got 6.000000
        stream_test.go:114: want 99.000000, got 2.000000
FAIL
FAIL    github.com/bmizerany/perks/quantile     1.181s
bmizerany commented 11 years ago

Working on it.

msakrejda commented 11 years ago

Yep, that seems to do it. Thanks for the quick response--you totally Tom Laned that one.