jquery / infrastructure-puppet

Puppet configuration for jQuery Infrastructure servers.
MIT License
6 stars 9 forks source link

jQuery Blog fails converts some non-ASCII characters to question marks #11

Closed mgol closed 9 months ago

mgol commented 2 years ago

The jQuery blog at https://blog.jquery.com/ doesn't support some non-ASCII characters, like the ones in my name: ł, ę. If you enter them anywhere, they get converted to question marks.

The issue looks related to the database as it's observed regardless of if you enter such a letter when authoring a blog post or in a plain text field when setting First/Last Name on the profile settings page.

The same issue doesn't exist in the UI (https://blog.jqueryui.com/) & Mobile (https://blog.jquerymobile.com/) blogs despite them being on the same WordPress version (5.8.1).

cc @timmywil @Krinkle

Krinkle commented 1 year ago

Example post, last sentence of https://blog.jquery.com/2021/10/07/jquery-maintainers-update-and-transition-jquery-ui-as-part-of-overall-modernization-efforts/

Additional gratitude is owed to Micha? Go??biowski-Owczarek for preparing

Let's revisit this after moving blogs to a new server, ref https://github.com/jquery/infrastructure-puppet/issues/6.

supertassu commented 1 year ago

The wp_posts table for the jQuery blog uses latin1, which probably explains this issue.We can try to convert it to utf8mb4, but I think I want to switchover traffic first.

supertassu commented 1 year ago

For future me:

MariaDB [information_schema]> select table_schema, table_name, table_collation from tables where table_schema like 'wordpress%' and table_collation not like 'utf8%';
+------------------------+------------------------+-------------------+
| table_schema           | table_name             | table_collation   |
+------------------------+------------------------+-------------------+
| wordpress_jquerymobile | wp_gcb                 | latin1_swedish_ci |
| wordpress_jquery       | wp_comments            | latin1_swedish_ci |
| wordpress_jquery       | wp_users               | latin1_swedish_ci |
| wordpress_jquery       | wp_links               | latin1_swedish_ci |
| wordpress_jquery       | wp_options             | latin1_swedish_ci |
| wordpress_jquery       | wp_redirection_modules | latin1_swedish_ci |
| wordpress_jquery       | wp_redirection_items   | latin1_swedish_ci |
| wordpress_jquery       | wp_redirection_logs    | latin1_swedish_ci |
| wordpress_jquery       | wp_redirection_groups  | latin1_swedish_ci |
| wordpress_jquery       | wp_posts               | latin1_swedish_ci |
| wordpress_jquery       | wp_w3tc_cdn_queue      | latin1_swedish_ci |
| wordpress_jquery       | wp_usermeta            | latin1_swedish_ci |
| wordpress_jquery       | wp_postmeta            | latin1_swedish_ci |
+------------------------+------------------------+-------------------+
13 rows in set (0.001 sec)
supertassu commented 9 months ago

Generated the SQL with:

MariaDB [information_schema]> SELECT GROUP_CONCAT(CONCAT('ALTER TABLE ', TABLE_SCHEMA, '.', TABLE_NAME, ' CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;') SEPARATOR "\n")  AS command FROM INFORMATION_SCHEMA.TABLES WHE
RE table_schema LIKE 'wordpress%' AND table_collation NOT LIKE 'utf8%'\G
*************************** 1. row ***************************
command: ALTER TABLE wordpress_jquerymobile.wp_gcb CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
ALTER TABLE wordpress_jquery.wp_comments CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
ALTER TABLE wordpress_jquery.wp_users CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
ALTER TABLE wordpress_jquery.wp_links CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
ALTER TABLE wordpress_jquery.wp_options CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
ALTER TABLE wordpress_jquery.wp_redirection_modules CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
ALTER TABLE wordpress_jquery.wp_redirection_items CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
ALTER TABLE wordpress_jquery.wp_redirection_logs CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
ALTER TABLE wordpress_jquery.wp_redirection_groups CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
ALTER TABLE wordpress_jquery.wp_posts CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
ALTER TABLE wordpress_jquery.wp_w3tc_cdn_queue CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
ALTER TABLE wordpress_jquery.wp_usermeta CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
ALTER TABLE wordpress_jquery.wp_postmeta CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
1 row in set (0.002 sec)

And executed that. Everything seems to work correctly now.