Open vytux-com opened 4 years ago
Older versions of MySQL only support UTF-8 characters of up to 3 bytes.
This excludes emoji, which require 4 bytes.
Newer versions of MySQL support 4-byte UTF-8.
Supporting 4-byte UTF-8 is easy.
Upgrading existing sites is harder.
Another problem is that most servers cannot create indexes on more than 767 bytes.
For 3-byte (utf8), this is 255 characters. For 4-byte (utf8mb4), this is 191 characters.
We currently have lots of indexed columns that are 255 characters long.
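The two character limits follow from integer division of the 767-byte index limit by the maximum bytes per character. A minimal sketch of that arithmetic (the function and constant names are made up for illustration):

```python
# Index key limit quoted in the thread, in bytes.
MAX_INDEX_BYTES = 767


def max_indexed_chars(bytes_per_char: int, limit: int = MAX_INDEX_BYTES) -> int:
    """Largest whole number of characters that fits in an index key."""
    return limit // bytes_per_char


print(max_indexed_chars(3))  # utf8: 255
print(max_indexed_chars(4))  # utf8mb4: 191
```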
Is there an easy way to validate the input, to warn or even disable the save button while a 4-byte UTF-8 character is present in a field?
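One possible shape for the check itself, as a rough sketch rather than anything webtrees provides: a character needs 4 bytes in UTF-8 exactly when its code point is above U+FFFF, so a field can be scanned for such characters before saving. The function names here are hypothetical, and Python is used only for illustration; the same code-point test would apply in whatever language does the validation.

```python
from typing import List


def four_byte_chars(text: str) -> List[str]:
    """Return the characters in text that need 4 bytes in UTF-8.

    Code points above U+FFFF (emoji, for example) are the ones that
    a 3-byte utf8 column cannot store.
    """
    return [ch for ch in text if ord(ch) > 0xFFFF]


def is_safe_for_3_byte_utf8(text: str) -> bool:
    """True if every character fits in at most 3 UTF-8 bytes."""
    return not four_byte_chars(text)
```

A form handler could call `is_safe_for_3_byte_utf8()` on each field and show a warning listing the result of `four_byte_chars()` when it returns false.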
I can't think of an easy solution or workaround.
Using utf8mb4 on new installations is straightforward. We can create a test table that uses all the features that we need.
CREATE TABLE t (
c VARCHAR(255) COLLATE utf8mb4_unicode_ci,
INDEX(c)
) ENGINE=InnoDB ROW_FORMAT=dynamic;
If this is successful, then we can use utf8mb4. If it fails, then we use utf8.
We then store the value in data/config.ini.php.
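The probe-and-fall-back idea can be sketched as follows. This is an illustration under assumptions, not the webtrees implementation: `execute` stands for any callable that runs one SQL statement and raises on failure, and `choose_charset` is a hypothetical name. Writing the result to data/config.ini.php is left out.

```python
from typing import Callable

# The test table from the thread: a 255-character indexed utf8mb4 column.
TEST_TABLE_SQL = """
CREATE TABLE t (
    c VARCHAR(255) COLLATE utf8mb4_unicode_ci,
    INDEX(c)
) ENGINE=InnoDB ROW_FORMAT=dynamic
"""


def choose_charset(execute: Callable[[str], None]) -> str:
    """Try to create the test table; fall back to utf8 if that fails."""
    try:
        execute(TEST_TABLE_SQL)
    except Exception:
        # The server rejected the table (for example, index too long).
        return "utf8"
    # Clean up the probe table before reporting success.
    execute("DROP TABLE t")
    return "utf8mb4"
```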
Updating existing databases has many difficulties. It may be impossible on some servers.
MySQL <= 5.7.6 - does not support utf8mb4
MySQL > 5.7.7 and < 8.0.0 - will support utf8mb4 if innodb_large_prefix is set.
MySQL >= 8.0.0 - does support utf8mb4
MariaDB < 10.2.2 - does not support utf8mb4
MariaDB >= 10.2.2 and < 10.3.1 - will support utf8mb4 if innodb_large_prefix is set.
MariaDB > 10.3.1 - does support utf8mb4
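The version list can be expressed as a small lookup. This is only a sketch: `supports_utf8mb4` and its parameters are hypothetical names, and the list does not say where MySQL 5.7.7 and MariaDB 10.3.1 themselves fall, so the code assumes they belong to the newer bracket.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def supports_utf8mb4(flavour: str, version: Version, large_prefix: bool = False) -> bool:
    """Apply the version list above to one server.

    large_prefix says whether innodb_large_prefix is set. The boundary
    versions 5.7.7 and 10.3.1 are assigned to the newer bracket, which
    is an assumption.
    """
    if flavour == "mysql":
        full, conditional = (8, 0, 0), (5, 7, 7)
    elif flavour == "mariadb":
        full, conditional = (10, 3, 1), (10, 2, 2)
    else:
        raise ValueError(f"unknown server flavour: {flavour}")

    if version >= full:
        return True
    if version >= conditional:
        return large_prefix
    return False
```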
I remember that Nextcloud 15 had the same problem. The solution there was: https://docs.nextcloud.com/server/15/admin_manual/configuration_database/mysql_4byte_support.html
A message came up in the admin backend saying that my database did not support utf8mb4 and that I should update it. Maybe this is a way to handle it in webtrees?
Doing ALTER DATABASE nextcloud CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci; for the webtrees-db should be simple.
But I don't know exactly what $ sudo -u www-data php occ maintenance:repair does.
ALTER DATABASE nextcloud CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
This simply changes the default used for new tables.
When we create tables/columns, we use an explicit collation - so this default value is never used.
To "upgrade" the database from 3-byte UTF-8 to 4-byte UTF-8 we would need to modify every column. But we can't modify columns if foreign keys exist. So we would need to temporarily delete all foreign keys, and recreate them afterwards. But the foreign keys can have many names, depending on which version of webtrees created them.
Also, modifying tables is slow. So an automatic upgrade is difficult, because each step may take longer than the webserver timeout limit.
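The drop, convert, recreate sequence described above could be planned as a list of separate statements, so that each one can run in its own request and stay under the timeout. This is a sketch under assumptions: the `ForeignKey` type and `upgrade_statements` are hypothetical, the key names would have to be read from the live database (for example from information_schema) because they vary between versions, and ON DELETE / ON UPDATE actions are not carried over here. `CONVERT TO CHARACTER SET` also applies one collation to every text column in a table, which may not suit every column.

```python
from typing import List, NamedTuple


class ForeignKey(NamedTuple):
    table: str
    name: str
    column: str
    ref_table: str
    ref_column: str


def upgrade_statements(tables: List[str], foreign_keys: List[ForeignKey]) -> List[str]:
    """Build the statements in order: drop keys, convert tables, re-add keys."""
    drops = [
        f"ALTER TABLE `{fk.table}` DROP FOREIGN KEY `{fk.name}`"
        for fk in foreign_keys
    ]
    converts = [
        f"ALTER TABLE `{t}` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci"
        for t in tables
    ]
    adds = [
        f"ALTER TABLE `{fk.table}` ADD CONSTRAINT `{fk.name}` "
        f"FOREIGN KEY (`{fk.column}`) REFERENCES `{fk.ref_table}` (`{fk.ref_column}`)"
        for fk in foreign_keys
    ]
    return drops + converts + adds
```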
Hope to support utf8mb4_general_ci in version 2.1.0. Thanks!
Note: the surname/statistics code uses utf8_bin to disable the collation rules.
I got this error when adding a record with a note. My guess is that it's the UTF images, but if they are not supported they should be cleaned up.
PS. name anonymised