pingcap / tidb

TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
https://pingcap.com
Apache License 2.0
37.19k stars 5.84k forks source link

TiDB sees both old and updated row in the single SELECT result #51211

Open Tema opened 8 months ago

Tema commented 8 months ago

Bug Report

I was playing with TiDB transactions and discovered an interesting artifact which I can’t explain. Basically in case of parallel updates the transaction was able to see both original row and the updated one together returned by a single SELECT statement.

1. Minimal reproduce step (Required)

Create table:

# primary key is important
create table t (i int primary key);
Insert into t values (0);

Start one transaction:

Txn_1: 
BEGIN;
Select * from t;
+---+
| i |
+---+
| 0 |
+---+

Update in the separate transaction:

Txn_2: 
begin; 
update t set i = 1; 
commit;

Update same table in the original transaction:

Txn_1:
Update t set i = 2;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

2. What did you expect to see? (Required)

Original transaction should see only updated value:

Txn_1:
mysql> select * from t;
+---+
| i |
+---+
| 2 |
+---+

I actually observe this, but only after commit statement.

3. What did you see instead (Required)

Original transaction see both values:

mysql> Select * from t;
+---+
| i |
+---+
| 0 |
| 2 |
+---+

4. What is your TiDB version? (Required)

| Release Version: v7.6.0
Edition: Community
Git Commit Hash: 52794d985ba6325d75a714d4eaa0838d59425eb6
Git Branch: heads/refs/tags/v7.6.0
UTC Build Time: 2024-01-22 14:13:53
GoVersion: go1.21.5
Race Enabled: false
Check Table Before Drop: false
Store: tikv |
cfzjywxk commented 8 months ago

This behavior arises due to some ambiguity in semantics when mixing current read and snapshot read scenarios.

For example, in TiDB, suppose txn_1 uses an update statement with current read. This corresponds to KV operations such as deleting row 1 and inserting row 2 into TiDB's memory, while also locking both rows. Now, when executing select * from t, which is a snapshot read, it needs to merge the results from the memory write with the snapshot read. Depending on the timestamp used for the snapshot read, the final result may be 0 (from TiKV snapshot) and 2 (from memory). However, this issue does not occur when using select for update.

There are some known issues related to the mixed usages of snapshot read and current read, for example https://github.com/pingcap/tidb/issues/44200

MySQL exhibits similar behavior: img_v3_0289_bf2813a3-0234-469b-8d9c-b731640a58bg

Tema commented 5 months ago

This corresponds to KV operations such as deleting row 1 and inserting row 2 into TiDB's memory

Now, when executing select * from t, which is a snapshot read, it needs to merge the results from the memory write with the snapshot read.

If update is indeed represented as delete and insert, then it looks like the initial snapshot data is merged with insert row result only, but not with delete row result.