pingcap / tidb

TiDB is an open-source, cloud-native, distributed, MySQL-compatible database for elastic scale and real-time analytics.
https://pingcap.com
Apache License 2.0

Resolve the chaotic usage of `Constant` #53485

Open YangKeao opened 1 month ago

YangKeao commented 1 month ago

The Constant struct has the following definition:

// Constant stands for a constant value.
type Constant struct {
    Value   types.Datum
    RetType *types.FieldType
    // DeferredExpr holds a deferred function in a PlanCache cached plan.
    // It's only used to represent non-deterministic functions (see expression.DeferredFunctions)
    // in a PlanCache cached plan, so that they are not evaluated until the cached item is used.
    DeferredExpr Expression
    // ParamMarker holds param index inside sessionVars.PreparedParams.
    // It's only used to reference a user variable provided in the `EXECUTE` statement or `COM_EXECUTE` binary protocol.
    ParamMarker *ParamMarker
    hashcode    []byte

    collationInfo
}

It can represent three different things:

  1. A constant literal in SQL, which is simply represented as a Datum in the Value.
  2. A function which needs to be evaluated when the plan cache is finally used. For example, the NOW() in SELECT NOW().
  3. A parameter (?) in the SQL. It's a constant during the execution of a single statement, and is actually stored in session context.
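
To make the three roles concrete, here is a minimal sketch in Go. It is not TiDB's code: `constant`, `paramStore`, and the `kind` enum are hypothetical simplifications, and plain `int64` stands in for `types.Datum`. It only illustrates that a single type has to dispatch between three very different sources of a value.

```go
package main

import "fmt"

// paramStore is a hypothetical stand-in for the per-session parameter list
// (sessionVars.PreparedParams in the issue text).
type paramStore struct {
	params []int64
}

// kind says which of the three roles a constant plays (hypothetical enum).
type kind int

const (
	kindLiteral  kind = iota // a literal written in the SQL text
	kindDeferred             // a function evaluated when the cached plan is used
	kindParam                // a `?` placeholder resolved from the session
)

// constant is a simplified, hypothetical model of expression.Constant.
type constant struct {
	kind       kind
	value      int64        // used when kind == kindLiteral
	deferred   func() int64 // used when kind == kindDeferred
	paramIndex int          // used when kind == kindParam
}

// eval returns the current value, looking in a different place for each role.
func (c *constant) eval(ps *paramStore) int64 {
	switch c.kind {
	case kindDeferred:
		return c.deferred()
	case kindParam:
		return ps.params[c.paramIndex]
	default:
		return c.value
	}
}

func main() {
	ps := &paramStore{params: []int64{7}}
	calls := int64(0)

	literal := &constant{kind: kindLiteral, value: 42}
	deferred := &constant{kind: kindDeferred, deferred: func() int64 { calls++; return calls }}
	param := &constant{kind: kindParam, paramIndex: 0}

	fmt.Println(literal.eval(ps))  // the literal never changes
	fmt.Println(deferred.eval(ps)) // re-evaluated on every use
	fmt.Println(deferred.eval(ps))
	fmt.Println(param.eval(ps)) // read from the session's parameters

	ps.params[0] = 9 // the next EXECUTE binds a different argument
	fmt.Println(param.eval(ps))
}
```

Only the literal role can be answered from the stored value alone; the other two depend on state outside the struct.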

There are two problems:

  1. *ParamMarker includes a full session context and uses the SessionVars in it, which will be reset and is not safe to read when the session is about to execute the next statement.
  2. A lot of code reads the value directly from Constant.Value. This only works because SetParameterValuesIntoSCtx sets param.Datum = val and expressionRewriter.Leave creates a ParamMarker based on that value. The whole dependency is quite fragile, and I'm not sure it is still correct when the statement uses the plan cache.
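
The second problem can be illustrated with a small sketch. Everything here is hypothetical (`param`, `session`, `bindParams`); it models only the general pattern of a cached copy that stays correct as long as a separate refresh step runs, not TiDB's actual code paths.

```go
package main

import "fmt"

// param is a hypothetical parameter constant that keeps a copy of its value
// next to the index of the authoritative value in the session.
type param struct {
	Value int64 // cached copy, refreshed only by bindParams
	index int
}

// session is a hypothetical stand-in for the session's parameter list.
type session struct {
	params []int64
}

// bindParams models the refresh step: it stores the new arguments in the
// session and copies each one into the matching cached Value.
func bindParams(s *session, ps []*param, args ...int64) {
	s.params = args
	for _, p := range ps {
		p.Value = args[p.index]
	}
}

func main() {
	s := &session{}
	p := &param{index: 0}

	bindParams(s, []*param{p}, 1)
	fmt.Println(p.Value, s.params[p.index]) // copy and session agree

	// A code path that changes the session without running the refresh
	// step leaves the copy stale, so direct readers of Value see old data.
	s.params = []int64{2}
	fmt.Println(p.Value, s.params[p.index]) // copy and session now differ
}
```

Readers that go through the session always see the bound argument; readers of `Value` are only right while every path remembers to refresh it.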

YangKeao commented 1 month ago

Using Constant.Value directly actually causes a problem:

create table t (v bigint);
prepare stmt5 from 'select * from t where v = -?;';
set @arg=1;
execute stmt5 using @arg;
set @arg=-9223372036854775808;
execute stmt5 using @arg;

The last statement produces no warnings and fails with ERROR 1815 (HY000): expression eq(test.t.v, unaryminus(-9223372036854775808)) cannot be pushed down. However, if you run the following statements in a new session:

create table t (v bigint);
prepare stmt5 from 'select * from t where v = -?;';
set @arg=-9223372036854775808;
execute stmt5 using @arg;

It'll give you a warning.

YangKeao commented 1 month ago

Using Constant.Value directly actually causes a problem:

create table t (v bigint);
prepare stmt5 from 'select * from t where v = -?;';
set @arg=1;
execute stmt5 using @arg;
set @arg=-9223372036854775808;
execute stmt5 using @arg;

It produces no warnings. However, if you run the following statements in a new session:

create table t (v bigint);
prepare stmt5 from 'select * from t where v = -?;';
set @arg=-9223372036854775808;
execute stmt5 using @arg;

It'll give you a warning.

Oops. I realized that this is not related to Constant.Value. It's just a plan cache issue. I'll track it in #53504.

YangKeao commented 1 month ago

This issue is related to the usage of Constant.Value:

create table t (v varchar(16));
insert into t values ('156');
prepare stmt7 from 'select * from t where v = conv(?, 16, 8)';
set @arg=0x6E;
execute stmt7 using @arg;
execute stmt7 using @arg;
set @arg=0x70;
execute stmt7 using @arg;

mysql> create table t (v varchar(16));
Query OK, 0 rows affected (0.02 sec)

mysql> insert into t values ('156');
Query OK, 1 row affected (0.00 sec)

mysql> prepare stmt7 from 'select * from t where v = conv(?, 16, 8)';
Query OK, 0 rows affected (0.00 sec)

mysql> set @arg=0x6E;
Query OK, 0 rows affected (0.00 sec)

mysql> execute stmt7 using @arg;
+------+
| v    |
+------+
| 156  |
+------+
1 row in set (0.00 sec)

mysql> execute stmt7 using @arg;
+------+
| v    |
+------+
| 156  |
+------+
1 row in set (0.00 sec)

mysql> set @arg=0x70;
Query OK, 0 rows affected (0.01 sec)

mysql> execute stmt7 using @arg;
+------+
| v    |
+------+
| 156  |
+------+
1 row in set (0.00 sec)

It always returns 156 :facepalm:. I'm tracking this issue in https://github.com/pingcap/tidb/issues/53505
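
To see why the last result looks wrong, the arithmetic can be checked outside the database. This sketch assumes the argument is treated as the number it spells in hex, which is what the first two results suggest (0x6E matches the row '156'). `hexToOct` is a hypothetical helper, not a TiDB function.

```go
package main

import (
	"fmt"
	"strconv"
)

// hexToOct parses a base-16 string and renders the number in base 8,
// the same arithmetic as conv(x, 16, 8) applied to a hex number.
func hexToOct(hex string) (string, error) {
	n, err := strconv.ParseInt(hex, 16, 64)
	if err != nil {
		return "", err
	}
	return strconv.FormatInt(n, 8), nil
}

func main() {
	for _, h := range []string{"6E", "70"} {
		o, err := hexToOct(h)
		if err != nil {
			panic(err)
		}
		fmt.Printf("0x%s -> %s\n", h, o)
	}
	// Prints:
	// 0x6E -> 156
	// 0x70 -> 160
}
```

Under that assumption, the execution with @arg=0x70 compares v against '160', so the row '156' should not match.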

YangKeao commented 1 month ago

Now, there are two things to do:

  1. Hide the internal Value so that callers cannot use .Value directly, which may cause unexpected behavior (as shown above).
  2. Pass the EvalCtx to GetType and GetUserVar so that the param is read from the context. This helps us reach the goal of detaching an executor from the current session, which is useful for both lazy cursor fetch and sharing the plan cache across multiple sessions. See https://github.com/pingcap/tidb/issues/53533
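
One possible shape for these two changes is sketched below. It is only an illustration under stated assumptions: this `EvalCtx` interface, its `ParamValue` method, and the `NewLiteral`/`NewParam` constructors are hypothetical and are not TiDB's real API. The point is that the constant stores no session state, so the same expression can be evaluated against different contexts.

```go
package main

import "fmt"

// EvalCtx is a hypothetical, minimal evaluation context. The real TiDB
// interface is larger; only parameter lookup is modeled here.
type EvalCtx interface {
	ParamValue(idx int) (int64, bool)
}

// sessionCtx is a hypothetical EvalCtx backed by one session's parameters.
type sessionCtx struct {
	params []int64
}

func (s sessionCtx) ParamValue(idx int) (int64, bool) {
	if idx < 0 || idx >= len(s.params) {
		return 0, false
	}
	return s.params[idx], true
}

// Constant keeps its fields unexported, so code in other packages cannot
// read the raw value and has to call Eval with a context instead.
type Constant struct {
	value    int64
	paramIdx int
	isParam  bool
}

// NewLiteral builds a constant for a literal in the SQL text.
func NewLiteral(v int64) *Constant { return &Constant{value: v} }

// NewParam builds a constant for the idx-th `?` placeholder.
func NewParam(idx int) *Constant { return &Constant{paramIdx: idx, isParam: true} }

// Eval resolves the value. Parameters always come from ctx, never from a
// copy stored inside the constant.
func (c *Constant) Eval(ctx EvalCtx) (int64, error) {
	if !c.isParam {
		return c.value, nil
	}
	v, ok := ctx.ParamValue(c.paramIdx)
	if !ok {
		return 0, fmt.Errorf("parameter %d is not bound", c.paramIdx)
	}
	return v, nil
}

func main() {
	c := NewParam(0) // one expression, e.g. from a shared cached plan

	a := sessionCtx{params: []int64{10}}
	b := sessionCtx{params: []int64{20}}

	va, _ := c.Eval(a)
	vb, _ := c.Eval(b)
	fmt.Println(va, vb) // each session sees its own argument

	_, err := c.Eval(sessionCtx{})
	fmt.Println(err) // an unbound parameter is reported, not silently reused
}
```

Because nothing about a session is stored in `Constant`, there is no stale copy to refresh, and an executor holding the expression does not keep the session alive.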