Closed ioanarusiczki closed 1 month ago
This probably depends on what is in the denied names, so we should investigate. Engineers can play around with this in a Django shell on dev.
DeniedName is only used for collection names and user display names, so that's consistent. The list on dev does contain "🌠". When encoded in UTF-8, it starts with the same bytes as "🎧", so I think our optimization to check for denied names in one query fails in that case and results in this unwanted match.
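To make the byte-prefix observation concrete, here is a quick Python check. It is only an illustration, not code from the AMO codebase, and the helper name `common_prefix` is made up for this sketch. It shows the two emoji share their first two UTF-8 bytes while still being different strings, so a shared prefix alone would not make them equal.

```python
def common_prefix(a: bytes, b: bytes) -> bytes:
    """Return the longest run of leading bytes shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]


denied = "🌠"      # U+1F320, the entry present in the denied list on dev
candidate = "🎧"   # U+1F3A7, the name that was unexpectedly rejected

denied_bytes = denied.encode("utf-8")
candidate_bytes = candidate.encode("utf-8")
shared = common_prefix(denied_bytes, candidate_bytes)

print(denied_bytes.hex(" "))     # f0 9f 8c a0
print(candidate_bytes.hex(" "))  # f0 9f 8e a7
print(shared.hex(" "))           # f0 9f
print(denied == candidate)       # False
```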
Aaaand it's collations again.
On dev:
mysql> SHOW CREATE TABLE users_denied_name;
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| users_denied_name | CREATE TABLE `users_denied_name` (
`id` int NOT NULL AUTO_INCREMENT,
`created` datetime(6) NOT NULL,
`modified` datetime(6) NOT NULL,
`name` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=15 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci ROW_FORMAT=DYNAMIC |
Which matters because:
mysql> SELECT '🌠' = '🎧' COLLATE 'utf8mb4_general_ci';
+----------------------------------------+
| '?' = '?' COLLATE 'utf8mb4_general_ci' |
+----------------------------------------+
| 1 |
+----------------------------------------+
mysql> SELECT '🌠' = '🎧' COLLATE 'utf8mb4_0900_ai_ci';
+----------------------------------------+
| '?' = '?' COLLATE 'utf8mb4_0900_ai_ci' |
+----------------------------------------+
| 0 |
+----------------------------------------+
(Note that on top of this, DeniedName entries are case-insensitive, which adds complexity to the mix, but that's a different issue.)
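As far as I understand the MySQL documentation, the reason the first query returns 1 is that the "general" collations give every supplementary character (anything above U+FFFF, which includes most emoji) the same weight as U+FFFD, whereas the UCA 9.0.0-based utf8mb4_0900_ai_ci gives them distinct weights. The sketch below is a deliberately rough Python model of that rule only. The function names are hypothetical, and the handling of BMP characters (plain upper-casing) is an approximation, not how MySQL actually computes weights.

```python
from typing import List

REPLACEMENT_WEIGHT = 0xFFFD  # weight shared by all supplementary characters


def general_ci_weights(text: str) -> List[int]:
    """Rough model of utf8mb4_general_ci weights (assumption, simplified)."""
    weights = []
    for ch in text:
        if ord(ch) > 0xFFFF:
            # Supplementary characters all collapse to one weight.
            weights.append(REPLACEMENT_WEIGHT)
        else:
            # Approximation: case-insensitivity modeled by upper-casing.
            weights.extend(ord(c) for c in ch.upper())
    return weights


def general_ci_equal(a: str, b: str) -> bool:
    """Compare two strings under the simplified model above."""
    return general_ci_weights(a) == general_ci_weights(b)


print(general_ci_equal("🌠", "🎧"))  # True under this model
print(general_ci_equal("abc", "ABC"))
print("🌠" == "🎧")                  # False: the strings themselves differ
```

Under this model any two single-emoji names compare equal, which matches the unwanted match seen on dev and stage.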
Also interesting: locally my olympia database has utf8mb4_0900_ai_ci as the default collation, but my test_olympia one has utf8mb4_general_ci...
That is because we specify 'TEST': {'CHARSET': 'utf8mb4', 'COLLATION': 'utf8mb4_general_ci'} in our settings. That makes our test environment closer to dev/stage/prod, but different from local dev...
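For context, this is roughly where that entry lives in a Django settings module. Only the 'TEST' dict is quoted from our settings. The engine, database name and 'OPTIONS' shown here are assumptions for the sake of a self-contained example.

```python
# Sketch of a Django DATABASES setting. Only the TEST entry comes from
# the thread; ENGINE, NAME and OPTIONS are assumed for illustration.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "olympia",
        "OPTIONS": {"charset": "utf8mb4"},
        # Django applies CHARSET and COLLATION when it creates the test
        # database, so test_olympia gets utf8mb4_general_ci regardless of
        # the default collation of the local MySQL server.
        "TEST": {"CHARSET": "utf8mb4", "COLLATION": "utf8mb4_general_ci"},
    }
}

test_collation = DATABASES["default"]["TEST"]["COLLATION"]
print(test_collation)  # utf8mb4_general_ci
```

Nothing in this fragment sets a collation for the non-test database, which would be consistent with the local olympia database picking up the server default instead.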
This is fixed on AMO dev 🛠 ☑
What happened?
While trying to create a collection on Android, I noticed I get an error when trying to use emojis (on dev and stage). Not reproducible on prod right now.
It could have to do with some settings from admin, but when checking I did not see emojis added to the list of Denied Names.
I get a 400 on the collection endpoints for POST or PATCH.
I cannot reproduce this on prod. I also cannot reproduce it for ratings or add-on names (on dev, stage).
What did you expect to happen?
On AMO prod