Closed aptend closed 3 months ago
根本原因是。 in/prefix_in 这类表达式的 第二个参数是一个 Expr_List
而Expr_List的内容,目前是不支持无法折叠的函数的,但是这个语句里 current_account_id() 无法折叠。得想办法让这些函数在执行期先折叠才行。类似的函数还有(now(), current_date(), rand()等)
但是现在
select * from x where a = current_date() or (a = 0 and c in ('mo_catalog')); //不会报错,
这是因为我们还有另外一个bug
mysql> explain select * from x where a > current_date();
+----------------------------------------------------+
| TP QURERY PLAN |
+----------------------------------------------------+
| Project |
| -> Table Scan on db1.x |
| Filter Cond: (cast(x.a AS BIGINT) > 19892) |
+----------------------------------------------------+
这里不应该折叠 current_date的
the query results were wrong when taking current_account_id as one of filter conditions. repro:
mysql> create table x ( a int unsigned, b int, c varchar, primary key(a, c));
Query OK, 0 rows affected (0.03 sec)
mysql> insert into x values (0, 1, "mo_catalog"), (0, 2, "mysql"), (0, 4, "task");
Query OK, 3 rows affected (0.02 sec)
mysql> select * from x where a = current_account_id() or (a = 0 and c in ('mo_catalog'));
+------+------+------------+
| a | b | c |
+------+------+------------+
| 0 | 1 | mo_catalog |
+------+------+------------+
1 row in set (0.00 sec)
mysql> select * from x where a = 0 or (a = 0 and c in ('mo_catalog'));
+------+------+------------+
| a | b | c |
+------+------+------------+
| 0 | 1 | mo_catalog |
| 0 | 2 | mysql |
| 0 | 4 | task |
+------+------+------------+
3 rows in set (0.00 sec)
mysql> select current_account_id();
+----------------------+
| current_account_id() |
+----------------------+
| 0 |
+----------------------+
1 row in set (0.00 sec)
insert into x values(1,1,"mo_catalog");
mysql> select * from x;
+------+------+------------+
| a | b | c |
+------+------+------------+
| 0 | 1 | mo_catalog |
| 0 | 2 | mysql |
| 0 | 4 | task |
| 1 | 1 | mo_catalog |
+------+------+------------+
4 rows in set (0.00 sec)
mysql> select * from x where a = current_account_id() or (a = 0 and c in ('mo_catalog'));
+------+------+------------+
| a | b | c |
+------+------+------------+
| 0 | 1 | mo_catalog |
| 0 | 1 | mo_catalog |
+------+------+------------+
mysql> explain analyze select * from x where a = current_account_id() or (a = 0 and c in ('mo_catalog'));
+-------------------------------------------------------------------------------------------------------------------------------------------------+
| TP QURERY PLAN |
+-------------------------------------------------------------------------------------------------------------------------------------------------+
| Project |
| Analyze: timeConsumed=0ms waitTime=0ms inputRows=2 outputRows=2 InputSize=64bytes OutputSize=64bytes MemorySize=0bytes |
| -> Table Scan on a.x |
| Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=4 outputRows=2 InputSize=224bytes OutputSize=64bytes MemorySize=228bytes |
| Filter Cond: prefix_in(x.__mo_cpkey_col, serial(current_account_id()), '>Fmo_catalog ') |
+-------------------------------------------------------------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)
fixed the duplicated rows.
A brief.
Phenomenon:
create table x ( a int unsigned, b int, c varchar, primary key(a, c));
insert into x values (0, 1, "mo_catalog"), (0, 2, "mysql"), (0, 4, "task"),(1,1,"mo_catalog");
mysql> select * from x where a = current_account_id() or (a = 0 and c in ('mo_catalog'));
+------+------+------------+
| a | b | c |
+------+------+------------+
| 0 | 1 | mo_catalog |
| 0 | 1 | mo_catalog |
+------+------+------------+
the expected output is:
+------+------+------------+
| a | b | c |
+------+------+------------+
| 0 | 1 | mo_catalog |
| 0 | 2 | mysql |
| 0 | 4 | task |
+------+------+------------+
Cause:
Both disttae reader and pipeline wrongly handle the prefix-in expression Expr_F( Func["prefix_in"](nargs=2) Expr_Col(x.__mo_cpkey_col) Expr_Vec(typ:<id:61 notNullable:true width:65535 table:"x" >, vec:<len:2 data:"0, 0-mo_catalog" > ))
0
selects three rows(mo_catalog
,mysql
,task
), while prefix 0-mo_catalog
selects the mo_catlaog
row again. @gouhongshen fix this in https://github.com/matrixorigin/matrixone/pull/17041PrefixIn
will fail to find the prefix in the case prefix_in('0-mysql', ['0', '0-catalog'])
, where sort.Find will return 2, but 0 is the right answer. @aunjgr is kind to help with this. https://github.com/matrixorigin/matrixone/blob/a014b808cd236a5b68baf10657cf9f821a002081/pkg/sql/plan/function/func_prefix.go#L134-L136 PR merged
Is there an existing issue for the same bug?
Branch Name
main
Commit ID
46de2e1b1
Other Environment Information
Actual Behavior
Expected Behavior
no error and return normally.
Steps to Reproduce