risingwavelabs / risingwave

bug: wrong hidden identity derivation for materialized view #1453

Closed yezizp2012 closed 2 years ago

yezizp2012 commented 2 years ago

After https://github.com/singularity-data/risingwave/issues/1418 was fixed, the hidden identity derivation still looks incorrect: the hidden columns should be _row_id#0, _row_id#1, and _row_id#2, but v4 is hidden instead of _row_id#0.

dev=> create table t1 (v1 int, v2 int);
dev=> create table t2 (v3 int, v4 int);
dev=> create table t3 (v5 int, v6 int);

dev=> create materialized view mv1 as select * from t1, t2, t3 where t1.v1 = t2.v3 and t1.v1 = t3.v5;

dev=> select * from mv1;
 v1 | v2 | _row_id#0 | v3 | v5 | v6
----+----+-----------+----+----+----
(0 rows)

Here is the plan:

 StreamMaterialize { columns: [v1, v2, _row_id#0, v3, v4(hidden), _row_id#1(hidden), v5, v6, _row_id#2(hidden)], pk_columns: [_row_id#0, _row_id#1, _row_id#2] }
   StreamHashJoin { type: Inner, predicate: $0 = $6 }
     StreamHashJoin { type: Inner, predicate: $0 = $3 }
       StreamExchange { dist: HashShard([0]) }
         StreamFilter { predicate: true:Boolean AND true:Boolean }
           StreamTableScan { table: t1, columns: [v1, v2, _row_id#0], pk_indices: [2] }
       StreamExchange { dist: HashShard([0]) }
         StreamTableScan { table: t2, columns: [v3, v4, _row_id#0], pk_indices: [2] }
     StreamExchange { dist: HashShard([0]) }
       StreamTableScan { table: t3, columns: [v5, v6, _row_id#0], pk_indices: [2] }
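
To make the mismatch explicit, here is a small Python sketch that parses the column list of the StreamMaterialize node above and compares the columns flagged `(hidden)` with the ones that should be hidden. It assumes that exactly the generated `_row_id#N` columns are meant to be hidden, as stated in the report. The helper names (`parse_columns`, `check_hidden`) are made up for illustration and are not part of RisingWave, whose planner is written in Rust.

```python
import re

# Column list copied verbatim from the StreamMaterialize node in the plan above.
PLAN_COLUMNS = (
    "v1, v2, _row_id#0, v3, v4(hidden), _row_id#1(hidden), "
    "v5, v6, _row_id#2(hidden)"
)

# Assumption: generated row-id columns are the ones named _row_id#<n>.
ROW_ID = re.compile(r"^_row_id#\d+$")
HIDDEN_SUFFIX = "(hidden)"


def parse_columns(text):
    """Split the plan's column list into (name, is_hidden) pairs."""
    cols = []
    for raw in text.split(","):
        raw = raw.strip()
        hidden = raw.endswith(HIDDEN_SUFFIX)
        name = raw[: -len(HIDDEN_SUFFIX)] if hidden else raw
        cols.append((name, hidden))
    return cols


def check_hidden(cols):
    """Return (expected, actual) sets of hidden column names.

    Assumption: exactly the _row_id#N columns should be hidden.
    """
    expected = {name for name, _ in cols if ROW_ID.match(name)}
    actual = {name for name, hidden in cols if hidden}
    return expected, actual


cols = parse_columns(PLAN_COLUMNS)
expected, actual = check_hidden(cols)

wrongly_hidden = sorted(actual - expected)
wrongly_visible = sorted(expected - actual)
visible = [name for name, hidden in cols if not hidden]

print("wrongly hidden: ", wrongly_hidden)
print("wrongly visible:", wrongly_visible)
print("visible columns:", visible)
```

The visible columns it reports match the header of `select * from mv1` above: v4 is dropped from the output while _row_id#0 is exposed.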

skyzh commented 2 years ago

I'll take a look.

skyzh commented 2 years ago

[image]

The column mapping doesn't seem correct.

Should be 2->3, 3->4 instead?