bibanon / eve

Asagi replacement written in Python

Unspecified collation breaks post inserts in recent MySQL versions #19

Closed. AGSPhoenix closed this issue 5 years ago

AGSPhoenix commented 5 years ago

https://dev.mysql.com/doc/refman/8.0/en/charset-database.html

  • If CHARACTER SET charset_name is specified without COLLATE, character set charset_name and its default collation are used. To see the default collation for each character set, use the SHOW CHARACTER SET statement or query the INFORMATION_SCHEMA CHARACTER_SETS table.

[...]

  • Otherwise (neither CHARACTER SET nor COLLATE is specified), the server character set and server collation are used.

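The statements in create board.sql specify only the character set, not the collation, so each table gets that character set's default collation rather than the server collation. The result is mismatched collations, even when the server defaults are set.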
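To illustrate the point above, here is a minimal Python sketch of a table creation statement that pins both the character set and the collation, so the table does not fall back to the character set's default collation. The function name, table name, and columns are hypothetical and not taken from the project's schema, and `utf8mb4_general_ci` is only an assumed example; it should be whatever collation the server and the other tables actually use.

```python
def create_table_sql(table, charset="utf8mb4", collation="utf8mb4_general_ci"):
    """Build a CREATE TABLE statement that names both charset and collation.

    Hypothetical helper: the columns below are placeholders, not the real
    board schema, and the default collation is an assumption.
    """
    return (
        f"CREATE TABLE IF NOT EXISTS `{table}` (\n"
        "  `doc_id` INT UNSIGNED NOT NULL AUTO_INCREMENT,\n"
        "  `comment` TEXT,\n"
        "  PRIMARY KEY (`doc_id`)\n"
        # Naming COLLATE explicitly avoids inheriting the charset's default.
        f") ENGINE=InnoDB DEFAULT CHARSET={charset} COLLATE={collation};"
    )


if __name__ == "__main__":
    print(create_table_sql("example_board"))
```

Which collation a table actually ended up with can be checked with `SHOW TABLE STATUS` or by querying `INFORMATION_SCHEMA.TABLES`.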

https://www.monolune.com/what-is-the-utf8mb4_0900_ai_ci-collation/

AGSPhoenix commented 5 years ago

In my testing on Percona Server 8.0.13, updating the table creation statements to all use utf8mb4 seems to work. From https://stackoverflow.com/a/30074553/432690:

For a supplementary character, utf8[/utf8mb3] cannot store the character at all, while utf8mb4 requires four bytes to store it. Since utf8[/utf8mb3] cannot store the character at all, you do not have any supplementary characters in utf8[/utf8mb3] columns and you need not worry about converting characters or losing data when upgrading utf8[/utf8mb3] data from older versions of MySQL.
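As a rough sketch of the change described in this comment, the following Python snippet rewrites legacy `utf8` / `utf8mb3` names in SQL text to `utf8mb4`, including collation names such as `utf8_general_ci`. The function name is hypothetical and this is not how the project applies the fix. It is a plain text substitution, so it would also rewrite matches inside string literals or comments.

```python
import re

# Match "utf8" or "utf8mb3" as a whole charset name or as the prefix of a
# collation name (e.g. utf8_general_ci), but leave "utf8mb4" untouched.
_LEGACY_UTF8 = re.compile(
    r"(?<![A-Za-z0-9_])utf8(?:mb3)?(?![A-Za-z0-9])", re.IGNORECASE
)


def upgrade_charset(sql):
    """Return the SQL text with legacy utf8/utf8mb3 names replaced by utf8mb4."""
    return _LEGACY_UTF8.sub("utf8mb4", sql)


if __name__ == "__main__":
    print(upgrade_charset("ENGINE=InnoDB DEFAULT CHARSET=utf8;"))
```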