matrixorigin / matrixone

Hyperconverged cloud-edge native database
https://docs.matrixorigin.cn/en
Apache License 2.0
1.77k stars 274 forks source link

[Bug]: unsupported expression executor error occurs when selecting #16964

Closed aptend closed 3 months ago

aptend commented 3 months ago

Is there an existing issue for the same bug?

Branch Name

main

Commit ID

46de2e1b1

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

ERROR 20102 (HY000): unsupported expression executor for typ:<id:61 notNullable:true width:65535 table:"x" > list:<list:<typ:<id:61 notNullable:true width:65535 > f:<func:<obj:1086626725888 obj_name:"serial" > args:<typ:<id:27 notNullable:true > f:<func:<obj:1017907249152 obj_name:"current_account_id" > > > > > list:<typ:<id:61 notNullable:true width:65535 > lit:<sval:">\024F\001mo_catalog\000" > > >  now is not yet implemented

Expected Behavior

no error and return normally.

Steps to Reproduce

create table x ( a int unsigned, b int, c varchar, primary key(a, c));

select * from x where a = current_account_id() or (a = 0 and c in ('mo_catalog'));
mysql> explain select * from x where a = current_account_id() or (a = 0 and c in ('mo_catalog'));  
+---------------------------------------------------------------------------------------------------+
| TP QURERY PLAN                                                                                    |
+---------------------------------------------------------------------------------------------------+
| Project                                                                                           |
|   ->  Table Scan on db.x                                                                          |
|         Filter Cond: prefix_in(x.__mo_cpkey_col, serial(current_account_id()), '>Fmo_catalog ') |
+---------------------------------------------------------------------------------------------------+


### Additional information

_No response_
ouyuanning commented 3 months ago

根本原因是。 in/prefix_in 这类表达式的 第二个参数是一个 Expr_List

而Expr_List的内容,目前是不支持无法折叠的函数的,但是这个语句里 current_account_id() 无法折叠。得想办法让这些函数在执行期先折叠才行。类似的函数还有(now(), current_date(), rand()等)

ouyuanning commented 3 months ago

但是现在

select * from x where a = current_date() or (a = 0 and c in ('mo_catalog'));  //不会报错,

这是因为我们还有另外一个bug
mysql> explain select * from x where a > current_date();
+----------------------------------------------------+
| TP QURERY PLAN                                     |
+----------------------------------------------------+
| Project                                            |
|   ->  Table Scan on db1.x                          |
|         Filter Cond: (cast(x.a AS BIGINT) > 19892) |
+----------------------------------------------------+
这里不应该折叠 current_date的
aptend commented 3 months ago

the query results were wrong when taking current_account_id as one of filter conditions. repro:

mysql> create table x ( a int unsigned, b int, c varchar, primary key(a, c));
Query OK, 0 rows affected (0.03 sec)

mysql> insert into x values (0, 1, "mo_catalog"), (0, 2, "mysql"), (0, 4, "task");
Query OK, 3 rows affected (0.02 sec)

mysql> select * from x where a = current_account_id() or (a = 0 and c in ('mo_catalog'));
+------+------+------------+
| a    | b    | c          |
+------+------+------------+
|    0 |    1 | mo_catalog |
+------+------+------------+
1 row in set (0.00 sec)

mysql> select * from x where a = 0 or (a = 0 and c in ('mo_catalog'));
+------+------+------------+
| a    | b    | c          |
+------+------+------------+
|    0 |    1 | mo_catalog |
|    0 |    2 | mysql      |
|    0 |    4 | task       |
+------+------+------------+
3 rows in set (0.00 sec)

mysql> select current_account_id();
+----------------------+
| current_account_id() |
+----------------------+
|                    0 |
+----------------------+
1 row in set (0.00 sec)
gouhongshen commented 3 months ago
insert into x values(1,1,"mo_catalog");

mysql> select * from x;
+------+------+------------+
| a    | b    | c          |
+------+------+------------+
|    0 |    1 | mo_catalog |
|    0 |    2 | mysql      |
|    0 |    4 | task       |
|    1 |    1 | mo_catalog |
+------+------+------------+
4 rows in set (0.00 sec)

mysql> select * from x where a = current_account_id() or (a = 0 and c in ('mo_catalog'));
+------+------+------------+
| a    | b    | c          |
+------+------+------------+
|    0 |    1 | mo_catalog |
|    0 |    1 | mo_catalog |
+------+------+------------+

mysql> explain analyze select * from x where a = current_account_id() or (a = 0 and c in ('mo_catalog'));
+-------------------------------------------------------------------------------------------------------------------------------------------------+
| TP QURERY PLAN                                                                                                                                  |
+-------------------------------------------------------------------------------------------------------------------------------------------------+
| Project                                                                                                                                         |
|   Analyze: timeConsumed=0ms waitTime=0ms inputRows=2 outputRows=2 InputSize=64bytes OutputSize=64bytes MemorySize=0bytes                        |
|   ->  Table Scan on a.x                                                                                                                         |
|         Analyze: timeConsumed=0ms waitTime=0ms inputBlocks=1 inputRows=4 outputRows=2 InputSize=224bytes OutputSize=64bytes MemorySize=228bytes |
|         Filter Cond: prefix_in(x.__mo_cpkey_col, serial(current_account_id()), '>Fmo_catalog ')                                               |
+-------------------------------------------------------------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)

fixed the duplicated rows.

aptend commented 3 months ago

A brief.

Phenomenon:

create table x ( a int unsigned, b int, c varchar, primary key(a, c));
insert into x values (0, 1, "mo_catalog"), (0, 2, "mysql"), (0, 4, "task"),(1,1,"mo_catalog");
mysql> select * from x where a = current_account_id() or (a = 0 and c in ('mo_catalog'));
+------+------+------------+
| a    | b    | c          |
+------+------+------------+
|    0 |    1 | mo_catalog |
|    0 |    1 | mo_catalog |
+------+------+------------+

the expected output is:

+------+------+------------+
| a    | b    | c          |
+------+------+------------+
|    0 |    1 | mo_catalog |
|    0 |    2 | mysql      |
|    0 |    4 | task       |
+------+------+------------+

Cause: Both disttae reader and pipeline wrongly handle the prefix-in expression Expr_F( Func["prefix_in"](nargs=2) Expr_Col(x.__mo_cpkey_col) Expr_Vec(typ:<id:61 notNullable:true width:65535 table:"x" >, vec:<len:2 data:"0, 0-mo_catalog" > ))

  1. Disttae reader outputs duplicate rows: prefix 0 selects three rows(mo_catalog,mysql,task), while prefix 0-mo_catalog selects the mo_catlaog row again. @gouhongshen fix this in https://github.com/matrixorigin/matrixone/pull/17041
  2. PrefixIn will fail to find the prefix in the case prefix_in('0-mysql', ['0', '0-catalog']), where sort.Find will return 2, but 0 is the right answer. @aunjgr is kind to help with this. https://github.com/matrixorigin/matrixone/blob/a014b808cd236a5b68baf10657cf9f821a002081/pkg/sql/plan/function/func_prefix.go#L134-L136
aunjgr commented 3 months ago

PR merged