confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io
Other
128 stars 1.04k forks source link

Pull queries do not pull data inserted manually to a table #6742

Open spena opened 3 years ago

spena commented 3 years ago

Describe the bug When working with pull queries, I noticed that inserting data to a table using INSERT ... VALUES is not returned when querying the table using a pull query.

I'm not sure if the bug is that pull queries should read the updated table, or if materialized views should not allow any inserts. Anyway, the current behavior allows inserts into a MV. And those inserts are read by Push queries, but not by Pull queries.

To Reproduce I run this with the latest master (6.2.0) version.

Create the stream, the MV; and insert a row into the stream:

create stream s1(id int, name string) with (kafka_topic='s1', value_format='json', replicas=1, partitions=1);
create table t1 as select id, max(id) from s1 group by id;
insert into s1(id, name) values (1, '1');

The row can be queried using push and pull queries:

ksql> select * from t1 emit changes;
+---------------------------+-----------------------------+
|ID                         |KSQL_COL_0                   |
+---------------------------+-----------------------------+
|1                          |1                            |
^CQuery terminated

ksql> select * from t1 where id = 1;
+---------------------------+-----------------------------+
|ID                         |KSQL_COL_0                   |
+---------------------------+-----------------------------+
|1                          |1                            |                                                                                       |
Query terminated

If I insert a row into the MV, I can query it with push queries, but pull queries show the previous row:

insert into t1(id, ksql_col_0) values (1, 2);

ksql> select * from t1 emit changes;
+---------------------------+-----------------------------+
|ID                         |KSQL_COL_0                   |
+---------------------------+-----------------------------+
|1                          |2                            |
^CQuery terminated

ksql> select * from t1 where id = 1;
+---------------------------+-----------------------------+
|ID                         |KSQL_COL_0                   |
+---------------------------+-----------------------------+
|1                          |1                            |                                                                                       |
Query terminated

I waited a few minutes to see if some cache should be updated, but I still got the same row.

Expected behavior I don't know what the behavior should be.

spena commented 3 years ago

@MichaelDrogalis What do you think about this behavior? What should be correct?

MichaelDrogalis commented 3 years ago

I think this is a known bug. @mjsax has more context. I think the idea was that we needed to not make tables writable here, right?

agavra commented 3 years ago

This is probably along the lines of #4177 - @mjsax I added you to the ticket since it falls into the language fundamentals bucket. We probably want to prohibit these type of insert statements.

mjsax commented 3 years ago

Agreed. It does not make sense to insert into the result (that should be modeled as a MV, not a table) of a persistent query.