pingcap / tidb

TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
https://pingcap.com
Apache License 2.0
37.29k stars 5.85k forks source link

txn: read information.tables do not start a txn when auto_commit is false #57321

Open AndreMouche opened 5 days ago

AndreMouche commented 5 days ago

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

open two session, and set auto_commit=false H4n1uJW9BI

here t1 < t2< t3

2. What did you expect to see? (Required)

here I expect t3 could not get data id=1, which means t1 and t3 should be in one transaction

3. What did you see instead (Required)

t1 and t3 are not in one transaction

4. What is your TiDB version? (Required)

v7.5.2

Meanwhile, I have tested mysql 8.2, t1 and t3 are in the same transaction. and it seems caused by tidb do not read data from tikv when select * from information_schema.tables while mysql would read data from storage for this case

mysql> explain select * from information_schema.tables;
+----------------+----------+------+---------------+---------------+
| id             | estRows  | task | access object | operator info |
+----------------+----------+------+---------------+---------------+
| MemTableScan_4 | 10000.00 | root | table:TABLES  |               |
+----------------+----------+------+---------------+---------------+
1 row in set (0.00 sec)

In other words, for tidb, when auto_commit = false, transactions will not be started when encountering SQL that is not read or written from tikv. Although this meets expectations, it is inconsistent with the behavior of MySQL. should we adjust TiDB's behavior to be consistent with MySQL for this case?

pcqz commented 5 days ago

Most of the tables in infoschema are virtual tables. The executor is built as PhysicalMemTable when querying virtual table individually in non-autocommit mode, and there's no valid transaction info for the query without real start ts. The behavior seems reasonable since the data is retrieved from memory without consistency constraint, but it's not compatible with mysql.

Another place where tidb itself has a difference in the behavior is in cluster table and its virtual table in infoschema, such as cluster_processlist and processlist, which have different execution plans.

mysql> explain select * from information_schema.cluster_processlist;
+-----------------------+----------+-----------+---------------------------+--------------------------------+
| id                    | estRows  | task      | access object             | operator info                  |
+-----------------------+----------+-----------+---------------------------+--------------------------------+
| TableReader_5         | 10000.00 | root      |                           | data:TableFullScan_4           |
| └─TableFullScan_4     | 10000.00 | cop[tidb] | table:CLUSTER_PROCESSLIST | keep order:false, stats:pseudo |
+-----------------------+----------+-----------+---------------------------+--------------------------------+
2 rows in set (0.00 sec)

mysql> explain select * from information_schema.processlist;
+----------------+----------+------+-------------------+---------------+
| id             | estRows  | task | access object     | operator info |
+----------------+----------+------+-------------------+---------------+
| MemTableScan_4 | 10000.00 | root | table:PROCESSLIST |               |
+----------------+----------+------+-------------------+---------------+
1 row in set (0.00 sec)

The TableReader is built using buildNoRangeTableReader which will get snapshot ts and activate the transaction, so querying the cluster table containing virtual table has consistent behavior with mysql.