rerun-io / rerun

Visualize streams of multimodal data. Fast, easy to use, and simple to integrate. Built in Rust using egui.
https://rerun.io/
Apache License 2.0

Visible time range queries skip random chunks of data (for 2D points) #5686

Closed: roym899 closed this issue 4 months ago

roym899 commented 5 months ago

Describe the bug

Visible time range queries sometimes seem to miss chunks, i.e., one or more consecutive datapoints. I have observed this before with the drone example (with Points3D), but had trouble reproducing it reliably in that case. I believe this has been happening since the caching was introduced.

To Reproduce

Check this rrd file.

It contains sampled images over time (logged as colored 2D points) and the image changes every second. When setting the visibility history to -1s and scrolling along the timeline, black parts without any data appear despite data being logged at these times. The sections with missing data change when restarting the viewer, but there always seems to be at least one. They are fairly easy to find by scrolling along the timeline and looking for black parts to appear.
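To make the expectation concrete, here is a minimal sketch of what a -1s visibility history should resolve to. It is not Rerun code: `visible_window`, `timestamps_in_window`, and the 100 ms logging interval are assumptions made up for illustration. The point is only that when data is logged continuously, every cursor position should yield a non-empty window.

```python
from bisect import bisect_left, bisect_right


def visible_window(cursor_ms, start_offset_ms=-1000, end_offset_ms=0):
    """Resolve a cursor-relative visible range to absolute bounds (hypothetical helper)."""
    return cursor_ms + start_offset_ms, cursor_ms + end_offset_ms


def timestamps_in_window(sorted_times_ms, window):
    """Return every logged timestamp inside the inclusive window [lo, hi]."""
    lo, hi = window
    start = bisect_left(sorted_times_ms, lo)
    end = bisect_right(sorted_times_ms, hi)
    return sorted_times_ms[start:end]


# Assumed log: one batch of points every 100 ms for 10 seconds.
log_times_ms = list(range(0, 10_000, 100))

# Scrub the cursor along the timeline and collect positions that show nothing.
empty_cursors = [
    cursor
    for cursor in range(0, 10_000, 50)
    if not timestamps_in_window(log_times_ms, visible_window(cursor))
]
```

With gap-free data, `empty_cursors` stays empty, so any black frame in the viewer points at the query path rather than at the recording.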

https://github.com/rerun-io/rerun/assets/9785832/a1fb93d6-de3b-4fa5-b539-ef6f0b55175c

The issue can be seen at around 0:18; before that, I'm just showing what the data looks like.

Desktop (please complete the following information):

Rerun version

rerun_py 0.14.1 [rustc 1.74.0 (79e9716c9 2023-11-13), LLVM 17.0.4] x86_64-unknown-linux-gnu release-0.14.1 74f1c23, built 2024-02-29T11:00:55Z

teh-cmc commented 5 months ago

Oooo, nice catch. That's interesting -- off the top of my head this is very likely to be caused by the crazy last-minute shenanigans that went in for read recursivity etc.

We're not too far away from landing the new cached range APIs at this point so maybe the best approach here is to wait for that and see if that fixes it (among many other things...) :thinking:

teh-cmc commented 4 months ago

Somehow, I can still reproduce this on main, where all the APIs have been rewritten from scratch.

I can also reproduce it in single-threaded mode.

Not quite sure what's going on yet, but the API is clearly returning 0 results when it shouldn't.

roym899 commented 4 months ago

Fwiw, I think it might be affected by how quickly I start scrolling around in the data. If I just open the rrd file and don't touch the viewer for a while, it happens less often than when I immediately scroll around while the data is still being loaded.

teh-cmc commented 4 months ago

Interesting, might be related to invalidation then.

teh-cmc commented 4 months ago

I have an automated reproduction of this. It's a pretty humongous bug right in the middle of the range cache, hard to unsee once you've seen it :smile:

I've implemented the exact same bug twice, which is why it exists in both the old and the new APIs. I'm that consistent.

It can only affect offset-based range queries and requires the user to jump the time cursor in a specific pattern to happen, which is why it has gone unnoticed until now.
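The thread does not show the automated reproduction or say what the bug actually was, so the following is only a sketch of one common way to catch this class of problem: replay a sequence of cursor jumps and compare every cached offset-based query against an uncached one. `ToyRangeCache`, `query_uncached`, and `find_mismatch` are hypothetical names, and the toy cache is a deliberately simple (and correct) stand-in, not Rerun's range cache.

```python
import random
from bisect import bisect_left, bisect_right


def query_uncached(times, lo, hi):
    """Ground truth: all timestamps in the inclusive range [lo, hi]."""
    return times[bisect_left(times, lo):bisect_right(times, hi)]


class ToyRangeCache:
    """Hypothetical cache that remembers one contiguous covered interval."""

    def __init__(self, times):
        self.times = times
        self.covered = None  # (lo, hi) of the interval currently cached
        self.data = []

    def query(self, lo, hi):
        if self.covered is None or lo > self.covered[1] or hi < self.covered[0]:
            # The cursor jumped to a disjoint window: rebuild the cache.
            self.covered = (lo, hi)
            self.data = query_uncached(self.times, lo, hi)
        else:
            # Overlapping window: extend the cached interval on either side.
            # Integer timestamps let us use lo..c_lo-1 and c_hi+1..hi.
            c_lo, c_hi = self.covered
            if lo < c_lo:
                self.data = query_uncached(self.times, lo, c_lo - 1) + self.data
                c_lo = lo
            if hi > c_hi:
                self.data = self.data + query_uncached(self.times, c_hi + 1, hi)
                c_hi = hi
            self.covered = (c_lo, c_hi)
        return self.data[bisect_left(self.data, lo):bisect_right(self.data, hi)]


def find_mismatch(times, cache, cursors, offset=-1000):
    """Return the first cursor where the cache disagrees with the store, else None."""
    for cursor in cursors:
        lo, hi = cursor + offset, cursor
        if cache.query(lo, hi) != query_uncached(times, lo, hi):
            return cursor
    return None


# Assumed data: integer millisecond timestamps, one every 100 ms.
times = list(range(0, 10_000, 100))
rng = random.Random(0)  # fixed seed so a failing jump pattern can be replayed
cursors = [rng.randrange(0, 10_000) for _ in range(1_000)]
mismatch = find_mismatch(times, ToyRangeCache(times), cursors)
```

Because the failure depends on a specific pattern of cursor jumps, a seeded random walk over cursor positions is a cheap way to search for that pattern and then replay it deterministically.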

Should have the PR soon, if the headache finally goes away...