deephaven / deephaven-core

Deephaven Community Core

rollup <- lastby <- blink table appears to have stale data #6394

Closed niloc132 closed 1 day ago

niloc132 commented 2 days ago

At this time we don't have a clearer way to reproduce this. A blink time_table did not reproduce the issue in earlier testing.

Some rollups created from a lastby seem to have inconsistent contents: some rows appear up to date, while others appear to still show data from an earlier update. In this example we use a table publisher and a lastby to produce an updating table controlled by the user, plus a rollup whose data, at some steps, is clearly not consistent with the lastby table.

Note that expanding, collapsing, or scrolling the rollup fetches a fresh snapshot from the server, and that snapshot is apparently up to date again, so all tables must be left visible and untouched as directed. This was originally reproduced with many thousands of rows updated at a time, but in this example we will update just one row at a time.

Steps to reproduce:

  1. Set up the tables by running this Python in the web IDE:

    from deephaven.stream.table_publisher import table_publisher
    from deephaven import dtypes
    from deephaven import agg
    from deephaven.table_factory import new_table
    from deephaven.column import string_col
    
    table_def = {
        'id': dtypes.string,
        'org': dtypes.string,
        'batchid': dtypes.string
    }
    
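    # Blink table fed by the publisher; it holds only the rows added in the current cycle
    blink_table, publisher = table_publisher(name='repro', col_defs=table_def)
    # Keep the most recent row per id
    table = blink_table.last_by('id')
    # Max batchid per org, plus the root aggregate
    rollup = table.rollup(aggs=[agg.max_('batchid')], by=['org'])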
  2. Rearrange the UI so that all three tables are visible side by side
  3. Create a single row via the table publisher
    publisher.add(new_table(cols=[
        string_col('id', ['a']),
        string_col('org', ['b']),
        string_col('batchid', ['1'])]
    ))

    At this point we expect rollup to show the same row as table, plus the expected hierarchy: there should be two rows (the root and its child), each with batchid=1. Instead, we see no rows.

  4. Create another row via the table publisher, with a larger batchid (so that the "max" shows the new row at each level of the hierarchy)
    publisher.add(new_table(cols=[
        string_col('id', ['a']),
        string_col('org', ['b']),
        string_col('batchid', ['2'])]
    ))

    We expect to see two rows here, both with batchid=2. Instead, we see two rows: the "root" still shows the stale batchid=1, while the child shows batchid=2.
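
For reference, the expected values at steps 3 and 4 can be modelled in plain Python. This is only a sketch of the intended lastby and max-rollup semantics, not Deephaven code: expected_rollup is a hypothetical helper, rows are represented as dicts, and it assumes the string batchid values compare lexicographically.

```python
def expected_rollup(published_rows):
    """Model last_by('id') followed by a max('batchid') rollup by 'org'.

    Hypothetical helper for illustration only; it does not call Deephaven.
    Returns (root_max, {org: max_batchid}).
    """
    # last_by('id'): keep only the most recently published row per id.
    last_by_id = {}
    for row in published_rows:
        last_by_id[row["id"]] = row

    # Child level of the rollup: max batchid per org (string comparison).
    per_org = {}
    for row in last_by_id.values():
        org = row["org"]
        if org not in per_org or row["batchid"] > per_org[org]:
            per_org[org] = row["batchid"]

    # Root level: max over every org, or None when nothing was published.
    root = max(per_org.values()) if per_org else None
    return root, per_org


# Rows published in steps 3 and 4 of the reproduction.
step3 = [{"id": "a", "org": "b", "batchid": "1"}]
step4 = step3 + [{"id": "a", "org": "b", "batchid": "2"}]

print(expected_rollup(step3))  # ('1', {'b': '1'})
print(expected_rollup(step4))  # ('2', {'b': '2'})
```

In this model the root and the child always agree after each publish, which is the consistency the rollup in the UI fails to show.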