MariaDB / mariadb-docker

Docker Official Image packaging for MariaDB
https://mariadb.org
GNU General Public License v2.0
Default collation changed between 11.4.1 and 11.4.2 (latest) #591

Closed. Qwarctick closed this issue 1 month ago

Qwarctick commented 1 month ago

Hello,

The default collation in version 11.4.1 was utf8mb4_general_ci. As of 11.4.2 it appears to be utf8mb4_uca1400_ai_ci instead.

Is this the expected behavior? I haven't found any evidence of such a change.

11.4.1

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 11.4.1-MariaDB-1:11.4.1+maria~ubu2204 mariadb.org binary distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> SELECT * FROM INFORMATION_SCHEMA.SCHEMATA;
+--------------+--------------------+----------------------------+------------------------+----------+----------------+
| CATALOG_NAME | SCHEMA_NAME        | DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME | SQL_PATH | SCHEMA_COMMENT |
+--------------+--------------------+----------------------------+------------------------+----------+----------------+
| def          | information_schema | utf8mb3                    | utf8mb3_general_ci     | NULL     |                |
| def          | sys                | utf8mb3                    | utf8mb3_general_ci     | NULL     |                |
| def          | mysql              | utf8mb4                    | utf8mb4_general_ci     | NULL     |                |
| def          | performance_schema | utf8mb3                    | utf8mb3_general_ci     | NULL     |                |
+--------------+--------------------+----------------------------+------------------------+----------+----------------+

11.4.2

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 11.4.2-MariaDB-ubu2404 mariadb.org binary distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> SELECT * FROM INFORMATION_SCHEMA.SCHEMATA;
+--------------+--------------------+----------------------------+------------------------+----------+----------------+
| CATALOG_NAME | SCHEMA_NAME        | DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME | SQL_PATH | SCHEMA_COMMENT |
+--------------+--------------------+----------------------------+------------------------+----------+----------------+
| def          | information_schema | utf8mb3                    | utf8mb3_general_ci     | NULL     |                |
| def          | sys                | utf8mb3                    | utf8mb3_general_ci     | NULL     |                |
| def          | mysql              | utf8mb4                    | utf8mb4_uca1400_ai_ci  | NULL     |                |
| def          | performance_schema | utf8mb3                    | utf8mb3_general_ci     | NULL     |                |
+--------------+--------------------+----------------------------+------------------------+----------+----------------+

grooverdan commented 1 month ago

The issues in #563 / #560 that we worked around are no longer a problem, hence it's back to uca1400_ai_ci.

grooverdan commented 1 month ago

This is commit 58614cbe55e240e201d95b32b019dd05928b00f8

If you have suggestions for a MariaDB container release process where these things are less of a surprise, I'm happy to hear what would work better for you.

Qwarctick commented 1 month ago

Thanks for the information. I can close the issue.

I have no idea how to improve the way this is communicated. Release notes don't apply well to Docker images. It might be possible to have a BREAKING_CHANGES.md file listing the changes, but that's probably not the best way to communicate them.

An issue like this one, with the explanation in it, will at least warn the next people who run into the change.

grooverdan commented 1 month ago

I have put up release notes before - https://mariadb.com/kb/en/mariadb-10-11-6-release-notes/#docker-official-images - just ran out of attention recently.

A breaking-changes file in the repo is probably OK, but people will only look at it once they have hit an issue, as you did.

I could put the exceptionally noteworthy things as something in the container log on startup. Do you think this has value?

Nefcanto commented 1 month ago

@grooverdan, I found this issue after a long search on the Internet and after asking in the communities. Is that change here to remain? Should we write the migration script to migrate all of our servers, databases, tables, and columns to the new collation?

The way I understood it, the new collation is much better in supporting Unicode characters and it's especially useful for multilingual applications. Thus it makes sense to accept it as the new default collation for general purposes.

grooverdan commented 1 month ago

Is that change here to remain?

Yes, it was deliberate and permanent. The collation becomes the default for the utf8mb4 character set unless specified otherwise.
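For anyone who needs the old behaviour on a particular database, one way to "specify otherwise" is to name the collation explicitly when the database is created. The following is only a minimal sketch: the helper names `quote_ident` and `create_database_sql` and the schema name `app` are made up for illustration, and Python is used only to assemble the statement text. The `CREATE DATABASE ... CHARACTER SET ... COLLATE ...` syntax itself is standard MariaDB.

```python
def quote_ident(name: str) -> str:
    """Quote an identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def create_database_sql(schema: str,
                        charset: str = "utf8mb4",
                        collation: str = "utf8mb4_general_ci") -> str:
    """Build a CREATE DATABASE statement that pins charset and collation.

    Hypothetical helper: it only returns SQL text and never connects
    to a server.
    """
    # Sanity check for full collation names, which start with the charset name.
    if not collation.startswith(charset + "_"):
        raise ValueError(f"{collation} does not belong to {charset}")
    return (f"CREATE DATABASE IF NOT EXISTS {quote_ident(schema)} "
            f"CHARACTER SET {charset} COLLATE {collation};")


if __name__ == "__main__":
    # 'app' is an example schema name, not something from this thread.
    print(create_database_sql("app"))
```

A database created this way keeps the named collation as its default regardless of what the server default is, so new tables in it are not affected by the change discussed here.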

Should we write the migration script to migrate all of our servers, databases, tables, and columns to the new collation?

It would make sense. The collation has been there since 10.11 (and a few short-term support releases before it). The migration is fairly invasive, as you've probably seen. Hints for other people to follow in this endeavour would be much appreciated.
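As one possible starting point for such hints, here is a rough sketch of a generator for the migration statements. It is an assumption about how one might approach it, not a tested procedure: the function name `migration_statements` and the example schema and table names are hypothetical, and the (schema, table) pairs would have to come from your own query against INFORMATION_SCHEMA.TABLES with the system schemas excluded. It only produces SQL text for review.

```python
from typing import Iterable, List, Tuple

TARGET_CHARSET = "utf8mb4"
TARGET_COLLATION = "utf8mb4_uca1400_ai_ci"


def quote_ident(name: str) -> str:
    """Quote an identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def migration_statements(tables: Iterable[Tuple[str, str]]) -> List[str]:
    """Return ALTER statements for the given (schema, table) pairs.

    Hypothetical helper: emits one ALTER DATABASE per schema (the first
    time the schema is seen) followed by one ALTER TABLE ... CONVERT TO
    per table. Nothing is executed.
    """
    statements: List[str] = []
    seen_schemas = set()
    for schema, table in tables:
        if schema not in seen_schemas:
            seen_schemas.add(schema)
            statements.append(
                f"ALTER DATABASE {quote_ident(schema)} "
                f"CHARACTER SET {TARGET_CHARSET} COLLATE {TARGET_COLLATION};"
            )
        statements.append(
            f"ALTER TABLE {quote_ident(schema)}.{quote_ident(table)} "
            f"CONVERT TO CHARACTER SET {TARGET_CHARSET} "
            f"COLLATE {TARGET_COLLATION};"
        )
    return statements


if __name__ == "__main__":
    # Example input only; these names do not come from this thread.
    for stmt in migration_statements([("app", "users"), ("app", "orders")]):
        print(stmt)
```

Note that `CONVERT TO CHARACTER SET` rewrites every character column in the table and rebuilds it, so any column that deliberately uses a different collation (a `_bin` one, say) would be overwritten. Reviewing the generated statements and trying them on a copy of the data first seems prudent.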

The way I understood it, the new collation is much better in supporting Unicode characters and it's especially useful for multilingual applications. Thus it makes sense to accept it as the new default collation for general purposes.

That was the plan.