Aloshi opened this issue 3 weeks ago
Thanks for the report.
Just a hint: I would recommend a compression factor of 200 rather than 100. 100 will work, but 200 will work better with only a small increase in footprint.
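As a back-of-the-envelope illustration of why the footprint increase is small (a toy calculation based on the k1 scale function from the t-digest paper, not the library's internals): the k-range that bounds the centroid count grows linearly with the compression factor, and each centroid is just a (mean, weight) pair of doubles, so doubling the compression roughly doubles a budget that is only a few kilobytes to begin with.

```java
// Toy calculation, not the library's code: with the k1 scale function
// k1(q) = delta / (2*pi) * asin(2q - 1), the total k-range over q in
// [0, 1] is delta / 2. Each centroid spans at most one k-unit, so the
// retained-centroid budget scales linearly with delta.
public class ScaleFunction {
    public static double k1(double q, double delta) {
        return delta / (2 * Math.PI) * Math.asin(2 * q - 1);
    }

    public static void main(String[] args) {
        double range100 = k1(1, 100) - k1(0, 100); // ~50 k-units
        double range200 = k1(1, 200) - k1(0, 200); // ~100 k-units
        System.out.printf("k-range: delta=100 -> %.1f, delta=200 -> %.1f%n",
                range100, range200);
    }
}
```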
Regarding your error, this is surprising.
Also, I just tried this test on my copy of the code. Here is the test I used for this:
```java
/**
 * Test for Issue #219
 */
@Test
public void testSingletonAfterMerge() {
    TDigest t1 = factory(100).create();
    TDigest t2 = factory(100).create();
    for (int i = 0; i < 4000; i++) {
        if (i < 1000) {
            t1.add(i / 3999.0);
        } else {
            t2.add(i / 3999.0);
        }
        t1.add(t2);
    }
}
```
Do you confirm that this does what your test does?
Ooops. Transcribed the code incorrectly. Here is my test code:

```java
public void testSingletonAfterMerge() {
    TDigest t1 = factory(100).create();
    TDigest t2 = factory(100).create();
    for (int i = 0; i < 4000; i++) {
        if (i < 1000) {
            t1.add(i / 3999.0);
        } else {
            t2.add(i / 3999.0);
        }
    }
    t1.add(t2);
}
```

I confirm that I get an assertion error on this.
So, congratulations are in order. If your test had tested for `i < 980` or `i < 1050`, it would not have found the issue. If you had called `merge()`, you would not have seen this. To find this, you had to slide right into a narrow zone of error on a rarely used API (almost everybody calls `merge`).
As a result, you have found the first serious bug in t-digest in quite a few years! Well done.
In the short-run, you can avoid the error by adding a singleton list instead of the single other digest or by using merge. I will have to figure out how to find time to cut a new release.
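At the call site, the singleton-list workaround is a one-line change (a sketch assuming t-digest 3.3's `add(List<? extends TDigest>)` overload is available, which the reproduction below confirms):

```java
// t1.add(t2);                                   // trips the assertion
t1.add(java.util.Collections.singletonList(t2)); // uses the List overload instead
```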
And if you live anywhere near me, I owe you a beverage of your choice.
Wow, talk about luck! This is such a battle-hardened library I was worried it was user error. An infinite number of monkeys eventually produces an `AssertionError`, I suppose. I'm a few hours away from any major cities, but if I ever happen to run into you I'll definitely say hi and get that drink. :)
> In the short-run, you can avoid the error by adding a singleton list instead of the single other digest or by using merge. I will have to figure out how to find time to cut a new release.
Switching from `digest1.add(digest2)` to `digest1.add(listOf(digest2))` makes the error in `add()` go away, but leads to a new error when calling `quantile()`:
```kotlin
val digest1 = MergingDigest(100.0)
val digest2 = MergingDigest(100.0)
repeat(4000) {
    if (it < 1000)
        digest1.add(it / 3999.0)
    else
        digest2.add(it / 3999.0)
}
digest1.add(listOf(digest2)) // now succeeds
assertEquals(0.50, digest1.quantile(0.50), 0.001) // crashes
```
```
java.lang.ArrayIndexOutOfBoundsException: arraycopy: last destination index 1056 out of bounds for double[1050]
	at java.base/java.lang.System.arraycopy(Native Method)
	at com.tdunning.math.stats.MergingDigest.merge(MergingDigest.java:381)
	at com.tdunning.math.stats.MergingDigest.mergeNewValues(MergingDigest.java:363)
	at com.tdunning.math.stats.MergingDigest.mergeNewValues(MergingDigest.java:353)
	at com.tdunning.math.stats.MergingDigest.quantile(MergingDigest.java:702)
	at aff.o11yKt.latencyReporting.digest.QuantileDigestsTest.tdigest bug(QuantileDigestsTest.kt:158)
```
> ...or by using merge.

I can't seem to find a public `merge()` method - am I missing something?
> Just a hint, I would recommend a value of the compression factor to be 200, not 100. 100 will work, but 200 will work better with a small increase in footprint.
Thanks for the tip!
I need to dive into this.
This weekend is a bad time, however.
Hi, I was testing some Kotlin code that uses t-digest under the hood, and by accident found that the following test case causes an AssertionError:
This fails with:
Which is this assert: `assert weight[lastUsedCell - 1] == 1`
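For context on what that assertion guards (a toy model, not the library's actual merge code): t-digest keeps the extreme centroids as weight-1 singletons so that tail quantiles stay exact, which means a merge pass must never fold the first or last centroid into a neighbor. A minimal sketch of that rule:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SingletonInvariant {
    // Merge adjacent centroid weights under a size cap, but never fold
    // the first or last centroid into a neighbor, so both ends remain
    // weight-1 singletons (the invariant the assertion checks).
    public static List<Integer> mergePreservingEnds(List<Integer> weights, int cap) {
        List<Integer> out = new ArrayList<>();
        out.add(weights.get(0)); // leading singleton survives untouched
        for (int i = 1; i < weights.size() - 1; i++) {
            int last = out.size() - 1;
            int w = weights.get(i);
            // only fold into an interior centroid, never into out[0]
            if (last > 0 && out.get(last) + w <= cap) {
                out.set(last, out.get(last) + w);
            } else {
                out.add(w);
            }
        }
        out.add(weights.get(weights.size() - 1)); // trailing singleton survives
        return out;
    }

    public static void main(String[] args) {
        List<Integer> merged = mergePreservingEnds(
                Arrays.asList(1, 1, 1, 1, 1, 1, 1, 1, 1, 1), 3);
        System.out.println(merged); // prints [1, 3, 3, 2, 1]
    }
}
```

The bug in this issue is, in effect, a path through `add(TDigest)` where that end-preservation breaks down.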
Interestingly, running `digest1.compress()` before the merge prevents the issue. I'm wondering if perhaps this is related to #216. This was with `com.tdunning:t-digest:3.3`. (And of course, thank you very much for your work!)
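The compress-first workaround described above, as a call-site sketch (assuming the same t-digest 3.3 API used in the reproduction):

```java
digest1.compress();   // flush any buffered values into centroids first
digest1.add(digest2); // reportedly no longer trips the assertion
```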