rerun-io / rerun

Visualize streams of multimodal data. Fast, easy to use, and simple to integrate. Built in Rust using egui.
https://rerun.io/
Apache License 2.0

Range query misses data when entity is logged with multiple log calls #1218

Open jleibs opened 1 year ago

jleibs commented 1 year ago

See https://github.com/rerun-io/rerun/issues/1215 and https://github.com/rerun-io/rerun/pull/1217 for additional context.

If we are trying to log data about an entity and it ends up getting split over multiple log calls, it's important that we always log the non-primary component before the other data. If we don't, then the range-query will generate an entity missing this data for the start of the range.
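To make the ordering problem concrete, here is a minimal toy model. It is not Rerun's actual store or query API: the row layout, the `range_query` function, and the `points`/`colors` component names are all assumptions made up for illustration. Each row stands for one log call, and rows with equal timestamps keep their insertion order.

```python
from typing import Any

# One row per log call: (timestamp, {component_name: value}).
# Hypothetical layout, not the real store schema.
Row = tuple[int, dict[str, Any]]


def range_query(rows: list[Row], primary: str, secondary: str,
                t_min: int, t_max: int) -> list[tuple[int, Any, Any]]:
    """Toy range query: one result per row that holds the primary component,
    joined with the latest secondary value seen up to and including that row."""
    # sorted() is stable, so rows with equal timestamps keep insertion order.
    ordered = sorted(rows, key=lambda row: row[0])
    latest_secondary = None
    results = []
    for t, components in ordered:
        if secondary in components:
            latest_secondary = components[secondary]
        if primary in components and t_min <= t <= t_max:
            results.append((t, components[primary], latest_secondary))
    return results


# Primary logged first: the secondary arrives in a later row and is missed.
primary_first = [(1, {"points": "P"}), (1, {"colors": "C"})]
# Secondary logged first: the join picks it up.
secondary_first = [(1, {"colors": "C"}), (1, {"points": "P"})]

missed = range_query(primary_first, "points", "colors", 1, 2)
found = range_query(secondary_first, "points", "colors", 1, 2)
```

In this sketch `missed` contains the point with no color, while `found` contains both, even though the two inputs carry the same data at the same timestamp.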

Making a collection of log calls "sync'd" in some way (sharing the exact timepoint) would be helpful for cases like this. Additionally, we may need to add a bit of special casing in the join implementation.

It seems like any data with a timestamp that exactly matches the timestamp of the primary component should be included in the results, even if it arrived in a second call, unless there is another primary component at that same timestamp, in which case :man_shrugging:
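The rule suggested above could be sketched roughly as follows. This is a toy model only: the function name, row layout, and component names are hypothetical, and it does not reflect how the real join is implemented. For the ambiguous case (another primary at the same timestamp) the sketch simply stops looking ahead, which is an arbitrary choice and not a decision made in this issue.

```python
from typing import Any

# One row per log call: (timestamp, {component_name: value}). Hypothetical layout.
Row = tuple[int, dict[str, Any]]


def range_query_same_time_join(rows: list[Row], primary: str, secondary: str,
                               t_min: int, t_max: int) -> list[tuple[int, Any, Any]]:
    """Toy range query that also joins secondary data logged in a later row
    with exactly the same timestamp as the primary."""
    ordered = sorted(rows, key=lambda row: row[0])  # stable: keeps insertion order
    latest_secondary = None
    results = []
    for i, (t, components) in enumerate(ordered):
        if secondary in components:
            latest_secondary = components[secondary]
        if primary not in components or not (t_min <= t <= t_max):
            continue
        joined = latest_secondary
        # Look ahead over later rows that share this exact timestamp.
        for later_t, later in ordered[i + 1:]:
            if later_t != t:
                break
            if primary in later:
                break  # another primary at the same time: ambiguous, stop here
            if secondary in later:
                joined = later[secondary]
        results.append((t, components[primary], joined))
    return results


split_calls = [(1, {"points": "P"}), (1, {"colors": "C"})]
joined_result = range_query_same_time_join(split_calls, "points", "colors", 1, 2)

two_primaries = [(1, {"points": "P1"}), (1, {"points": "P2"}), (1, {"colors": "C"})]
ambiguous_result = range_query_same_time_join(two_primaries, "points", "colors", 1, 2)
```

Here `joined_result` pairs the point with the color from the second call. In `ambiguous_result`, only the second primary receives the color, because the look-ahead for the first one stops at the next primary.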

emilk commented 1 year ago

~I think syncing the time points is the right solution here~

The arrow store will still treat two log messages (even with identical timestamps) separately, based on insertion order. You still end up with multiple rows in the store.

So if multiple rows have identical timestamps, the initial range result is just the first row.

…still, we should strive to have identical times when logging different components of the same entity.