tdunning / t-digest

A new data structure for accurate on-line accumulation of rank-based statistics such as quantiles and trimmed means
Apache License 2.0
1.97k stars 226 forks source link

Merging MergingDigests sometimes fails sanity check asserts #219

Open Aloshi opened 3 weeks ago

Aloshi commented 3 weeks ago

Hi, I was testing some Kotlin code that uses the tdigest under the hood, and by accident found the following test case causes an AssertionError:

    @Test
    fun `tdigest bug`() {
        val digest1 = MergingDigest(100.0)
        val digest2 = MergingDigest(100.0)

        repeat(4000) {
            if (it < 1000)
                digest1.add(it / 3999.0)
            else
                digest2.add(it / 3999.0)
        }

        digest1.add(digest2)
    }

This fails with:

java.lang.AssertionError
    at com.tdunning.math.stats.MergingDigest.merge(MergingDigest.java:490)
    at com.tdunning.math.stats.MergingDigest.mergeNewValues(MergingDigest.java:363)
    at com.tdunning.math.stats.MergingDigest.mergeNewValues(MergingDigest.java:353)
    at com.tdunning.math.stats.MergingDigest.add(MergingDigest.java:259)
    at com.tdunning.math.stats.MergingDigest.add(MergingDigest.java:246)
    at com.tdunning.math.stats.AbstractTDigest.add(AbstractTDigest.java:135)
    at mycompany.latencyReporting.digest.QuantileDigestsTest.tdigest bug(QuantileDigestsTest.kt:157)

Which is this assert: assert weight[lastUsedCell - 1] == 1

Interestingly, running digest1.compress() before the merge prevents the issue. I'm wondering if perhaps this is related to #216.

This was with com.tdunning:t-digest:3.3. (And of course, thank you very much for your work!)

tdunning commented 3 weeks ago

Thanks for the report.

Just a hint, I would recommend a value of the compression factor to be 200, not 100. 100 will work, but 200 will work better with a small increase in footprint.

Regarding your error, this is surprising.

Also, I just tried this test on my copy of the code. Here is the test I used for this:

/**
 * Test for Issue #219
 */
@Test
public void testSingletonAfterMerge() {
    TDigest t1 = factory(100).create();
    TDigest t2 = factory(100).create();
    for (int i = 0; i < 4000; i++) {
        if (i < 1000) {
            t1.add(i / 3999.0);
        } else {
            t2.add(i / 3999.0);
        }
        t1.add(t2);
    }
}

Do you confirm that this does what your test does?

tdunning commented 3 weeks ago

Ooops. Transcribed the code incorrectly. Here is my test code: '''java public void testSingletonAfterMerge() { TDigest t1 = factory(100).create(); TDigest t2 = factory(100).create(); for (int i = 0; i < 4000; i++) { if (i < 1000) { t1.add(i / 3999.0); } else { t2.add(i / 3999.0); } } t1.add(t2); } ''' I confirm that I get an assertion error on this.

tdunning commented 3 weeks ago

So, congratulations are in order. If your test had tested for i < 980 or i < 1050, it would not have found the issue. If you had called merge(), you would not have seen this.

To find this, you had to slide right into a narrow zone of error on a rarely used API (almost everybody calls merge)

As a result, you have found the first serious bug in t-digest in quite a few years! Well done.

In the short-run, you can avoid the error by adding a singleton list instead of the single other digest or by using merge. I will have to figure out how to find time to cut a new release.

And if you live anywhere near me, I owe you a beverage of your choice.

Aloshi commented 3 weeks ago

Wow, talk about luck! This is such a battle-hardened library I was worried it was user error. An infinite number of monkeys eventually produces an AssertionError, I suppose. I'm a few hours away from any major cities, but if I ever happen to run into you I'll definitely say hi and get that drink. :)

In the short-run, you can avoid the error by adding a singleton list instead of the single other digest or by using merge. I will have to figure out how to find time to cut a new release.

Switching from digest1.add(digest2) to digest1.add(listOf(digest2)) makes the error in add() go away, but leads to a new error when calling quantile():

val digest1 = MergingDigest(100.0)
val digest2 = MergingDigest(100.0)

repeat(4000) {
    if (it < 1000)
        digest1.add(it / 3999.0)
    else
        digest2.add(it / 3999.0)
}

digest1.add(listOf(digest2)) // now succeeds
assertEquals(0.50, digest1.quantile(0.50), 0.001)  // crashes
arraycopy: last destination index 1056 out of bounds for double[1050]
java.lang.ArrayIndexOutOfBoundsException: arraycopy: last destination index 1056 out of bounds for double[1050]
    at java.base/java.lang.System.arraycopy(Native Method)
    at com.tdunning.math.stats.MergingDigest.merge(MergingDigest.java:381)
    at com.tdunning.math.stats.MergingDigest.mergeNewValues(MergingDigest.java:363)
    at com.tdunning.math.stats.MergingDigest.mergeNewValues(MergingDigest.java:353)
    at com.tdunning.math.stats.MergingDigest.quantile(MergingDigest.java:702)
    at aff.o11yKt.latencyReporting.digest.QuantileDigestsTest.tdigest bug(QuantileDigestsTest.kt:158)

...or by using merge.

I can't seem to find a public merge() method - am I missing something?

Just a hint, I would recommend a value of the compression factor to be 200, not 100. 100 will work, but 200 will work better with a small increase in footprint.

Thanks for the tip!

tdunning commented 3 weeks ago

I need to dive into this.

This weekend is a bad time, however.

On Fri, Aug 23, 2024 at 9:31 AM Alec Lofquist @.***> wrote:

Wow, talk about luck! This is such a battle-hardened library I was worried it was user error. An infinite number of monkeys eventually produces an AssertionError, I suppose. I'm a few hours away from any major cities, but if I ever happen to run into you I'll definitely say hi and get that drink. :)

In the short-run, you can avoid the error by adding a singleton list instead of the single other digest or by using merge. I will have to figure out how to find time to cut a new release.

Switching from digest1.add(digest2) to digest1.add(listOf(digest2)) makes the error in add() go away, but leads to a new error when calling quantile():

val digest1 = MergingDigest(100.0)val digest2 = MergingDigest(100.0)

repeat(4000) { if (it < 1000) digest1.add(it / 3999.0) else digest2.add(it / 3999.0) }

digest1.add(listOf(digest2)) // now succeeds assertEquals(0.50, digest1.quantile(0.50), 0.001) // crashes

arraycopy: last destination index 1056 out of bounds for double[1050] java.lang.ArrayIndexOutOfBoundsException: arraycopy: last destination index 1056 out of bounds for double[1050] at java.base/java.lang.System.arraycopy(Native Method) at com.tdunning.math.stats.MergingDigest.merge(MergingDigest.java:381) at com.tdunning.math.stats.MergingDigest.mergeNewValues(MergingDigest.java:363) at com.tdunning.math.stats.MergingDigest.mergeNewValues(MergingDigest.java:353) at com.tdunning.math.stats.MergingDigest.quantile(MergingDigest.java:702) at aff.o11yKt.latencyReporting.digest.QuantileDigestsTest.tdigest bug(QuantileDigestsTest.kt:158)

...or by using merge.

I can't seem to find a public merge() method - am I missing something?

Just a hint, I would recommend a value of the compression factor to be 200, not 100. 100 will work, but 200 will work better with a small increase in footprint.

Thanks for the tip!

— Reply to this email directly, view it on GitHub https://github.com/tdunning/t-digest/issues/219#issuecomment-2307423909, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAB5E6XOWBUBOKGUDH4ANY3ZS5PVXAVCNFSM6AAAAABM66VGQGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMBXGQZDGOJQHE . You are receiving this because you commented.Message ID: @.***>