pingcap / dumpling

Dumpling is a fast, easy-to-use tool written by Go for dumping data from the database(MySQL, TiDB...) to local/cloud(S3, GCP...) in multifarious formats(SQL, CSV...).
Apache License 2.0
281 stars 85 forks source link

dumpling may dump extra data if `--where` is specified #371

Closed lichunzhu closed 3 years ago

lichunzhu commented 3 years ago

What did you do?

Use dumpling to dump TiDB v5.x data using the --where condition.

What did you expect to see?

Dumpling only dumps data that match where condition.

What did you see instead?

Dumpling dumps some data out of where condition.

Versions of the cluster

Dumpling version (run dumpling -V):

Release version: v5.3.0-alpha-5-gb2388dd-dev
Git commit hash: b2388dd8658bae58a5b8c533ae60aecd577eec08
Git branch:      master
Build timestamp: 2021-10-13 08:46:58Z
Go version:      go version go1.16.4 darwin/amd64```
Source database version (execute `SELECT version();` in a MySQL client):
```console
5.7.25-TiDB-v5.0.0

Other interesting information (system version, hardware config, etc):

>
>
lichunzhu commented 3 years ago

Root cause

https://github.com/pingcap/dumpling/blob/562cb6c021782ce3e35c279a611e7a381979acfd/v4/export/sql.go#L1075

When dumpling builds a where condition, if both --where and --rows are specified, dumpling will generate a where condition which is in the following format:

... WHERE ${where} AND ${chunks_split_sql}

which actually should be:

... WHERE (${where}) AND (${chunks_split_sql})

However, dump may dump incorrectly data if either of the following cases is matched:

  1. --where contains OR operator. Dumpling changes (A∨B)∧C directly to A∨B∧C which breaks the distributivity law.
  2. Dumpling(>=v4.0.9 or >=v5.0.0) dumps a TiDB v5.0+ table which contains a CLUSTERED PRIMARY key which contains at least two columns. In this case ${chunks_split_sql} will contain OR operator generated in https://github.com/pingcap/dumpling/blob/562cb6c021782ce3e35c279a611e7a381979acfd/v4/export/sql.go#L742 which will also breaks the distributivity law.

Comment

(A∨B)∧C = (A∧C)∨(B∧C)