test_negative_positive_uniform failure

metrics-rs / metrics

A metrics ecosystem for Rust.

MIT License

1.06k stars 145 forks source link

test_negative_positive_uniform failure #462

Open kathoum opened 4 months ago

kathoum commented 4 months ago

Test "test_negative_positive_uniform" occasionally fails on my machine (linux x86_64, rust 1.65.0):

---- summary::tests::test_negative_positive_uniform stdout ----
thread 'summary::tests::test_negative_positive_uniform' panicked at 'assert_relative_eq!(aval, sval, max_relative = distance)

    left  = -5.667140121494932
    right = -5.674032139894776

', metrics-util/src/summary.rs:307:13

To reproduce, run the test repeatedly until it fails (a few hundred repetitions may be required):

while cargo test --release --lib -- test_negative_positive_uniform ; do ; done

kathoum commented 4 months ago

Can be reproduced with the following change: in https://github.com/metrics-rs/metrics/blob/e2c37f4253349333387be659e8d44aa127e1c10d/metrics-util/src/summary.rs#L279

Replace thread_rng() with rand::rngs::mock::StepRng::new(seed.0, seed.1) with seed = (12194833282344109743, 13893952441524021419)

tobz commented 4 months ago

Interesting.

Thanks for the reproduction. Hopefully I can dig into this at some point soon.

tobz commented 4 months ago

As a small update, I did follow your instructions and was also able to reproduce this locally.

Haven't started digging into why it breaks, but good to know the reproduction is solid. 👍🏻

kathoum commented 3 months ago

I looked into the failure and I think the cause is the interpolation: it seems to me that the implementation of DDSketch rounds the 'rank' towards zero.

The line where truncation happens is https://github.com/mheffner/rust-sketches-ddsketch/blob/d6069cba291ed81924139b34c7ba92647b634a42/src/ddsketch.rs#L111

Truncating 'rank' is equivalent to 'Lower' interpolation mode in ndarray, so, to check this, I changed test_positive_uniform and test_negative_positive_uniform to call quantile_axis_mut with Lower instead of Linear: no test failures after >100k runs.

Additionally: with Lower interpolation mode, the assertions pass even with max_relative = alpha (no need to multiply by 2*aval, because assert_relative_eq already performs relative error calculation).

tobz commented 3 months ago

Interesting; nice find!

This makes sense to me from reading your comment, but I want to spend a little time trying to check the official DDSketch libraries to make sure that what rust-sketches-ddsketch is doing is actually the right behavior in the first place... since now you've got me thinking about this. :)

tobz commented 3 months ago

By the way, I haven't forgotten about this.

Work has been busy, and I've been working with one of the DDSketch authors (I work at Datadog) on another internal Rust project that also needs DDSketch... so I'm hoping to figure out both the answer to my above thoughts, and see if that inevitably means we should use the current crate, another crate, or write something entirely new.