pingcap / tidb

TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/
https://pingcap.com
Apache License 2.0
37k stars 5.82k forks source link

SHOW COLLATION format does not compatible with mysql 8.0 #54250

Closed mzhang77 closed 2 months ago

mzhang77 commented 3 months ago

Bug Report

1. Minimal reproduce step (Required)

mysql> SHOW COLLATION WHERE `Charset` = 'utf8mb4' AND `Default` = 'Yes';
+-------------+---------+------+---------+----------+---------+
| Collation   | Charset | Id   | Default | Compiled | Sortlen |
+-------------+---------+------+---------+----------+---------+
| utf8mb4_bin | utf8mb4 |   46 | Yes     | Yes      |       1 |
+-------------+---------+------+---------+----------+---------+
1 row in set (0.00 sec)

mysql> select @@version;
+--------------------+
| @@version          |
+--------------------+
| 8.0.11-TiDB-v8.1.0 |
+--------------------+
1 row in set (0.01 sec)

2. What did you expect to see? (Required)

The output format should be same as mysql8.0, which is:

mysql> SHOW COLLATION WHERE `Charset` = 'utf8mb4' AND `Default` = 'Yes';
+--------------------+---------+-----+---------+----------+---------+---------------+
| Collation          | Charset | Id  | Default | Compiled | Sortlen | Pad_attribute |
+--------------------+---------+-----+---------+----------+---------+---------------+
| utf8mb4_0900_ai_ci | utf8mb4 | 255 | Yes     | Yes      |       0 | NO PAD        |
+--------------------+---------+-----+---------+----------+---------+---------------+
1 row in set (0.01 sec)

mysql> select @@version;
+-----------+
| @@version |
+-----------+
| 8.0.37    |
+-----------+
1 row in set (0.00 sec)

3. What did you see instead (Required)

mysql show collation command returns 7 columns. But tidb show collation command returns 6 columns.

4. What is your TiDB version? (Required)

select @@version;
+--------------------+
| @@version          |
+--------------------+
| 8.0.11-TiDB-v8.1.0 |
+--------------------+
1 row in set (0.01 sec)

Which claims tidb is compatible with mysql 8.0.11

bb7133 commented 3 months ago

This is a known difference:

Default collation:

  • TiDB’s default collation is utf8mb4_bin.
  • MySQL 5.7’s default collation is utf8mb4_general_ci.
  • MySQL 8.0’s default collation is utf8mb4_0900_ai_ci.
mzhang77 commented 3 months ago

This is about the output format of SHOW COLLATION command. mysql returns 7 columns, tidb returns only 6 columns.

dveeden commented 3 months ago

TiDB:

mysql-8.0.11-TiDB-v8.2.0-alpha-444-g761b5926a7-dirty> SHOW COLLATION WHERE Collation='ascii_bin';
+-----------+---------+------+---------+----------+---------+
| Collation | Charset | Id   | Default | Compiled | Sortlen |
+-----------+---------+------+---------+----------+---------+
| ascii_bin | ascii   |   65 | Yes     | Yes      |       1 |
+-----------+---------+------+---------+----------+---------+
1 row in set (0.00 sec)

MySQL:

mysql-8.4.0> SHOW COLLATION WHERE Collation='ascii_bin';
+-----------+---------+----+---------+----------+---------+---------------+
| Collation | Charset | Id | Default | Compiled | Sortlen | Pad_attribute |
+-----------+---------+----+---------+----------+---------+---------------+
| ascii_bin | ascii   | 65 |         | Yes      |       1 | PAD SPACE     |
+-----------+---------+----+---------+----------+---------+---------------+
1 row in set (0.01 sec)
dveeden commented 3 months ago

Note that the MySQL docs might be outdated: https://bugs.mysql.com/bug.php?id=115444

dveeden commented 3 months ago

Another small difference: Sortlen is always 1 in TiDB. Not sure if the docs for this on https://docs.pingcap.com/tidb/stable/information-schema-collations are correct.

dveeden commented 3 months ago

https://github.com/pingcap/tidb/blob/master/docs/design/2020-01-24-collations.md also has some info on this.

dveeden commented 3 months ago

The formatting difference was because of a missing mysql.NotNullFlag which caused the client to reserve space to print NULL.