zieren / wasted-youth-tracker

Limit kids' time on their (Windows) PC and get a summary of the window titles.
GNU General Public License v3.0

MySQL regular expressions don't support utf8 :-( #33

Closed zieren closed 3 years ago

zieren commented 3 years ago

https://stackoverflow.com/questions/19774618/mysql-regex-utf-8-characters

It seems the MySQL regex library doesn't support utf8: it matches byte by byte, so e.g. "täst" does not match "t.st" but does match "t..st". When using wildcards, keep in mind that a character may span multiple bytes while a wildcard matches only one byte. Word boundaries etc. will likely not work either, which would be annoying.
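To make the byte-vs-character mismatch concrete, here is a minimal sketch that emulates byte-wise matching with Python's `re` module on UTF-8 encoded bytes. It does not talk to MySQL; the helper names `bytewise_search` and `charwise_search` are made up for illustration, and the sketch only assumes that the regex engine treats each byte as one character.

```python
import re


def bytewise_search(pattern: str, text: str) -> bool:
    """Search the UTF-8 bytes of text, so '.' consumes one byte, not one character."""
    return re.search(pattern.encode("utf-8"), text.encode("utf-8")) is not None


def charwise_search(pattern: str, text: str) -> bool:
    """Search the decoded string, so '.' consumes one full character."""
    return re.search(pattern, text) is not None


title = "t\u00e4st"  # "täst"; the ä is two bytes (0xC3 0xA4) in UTF-8

print(bytewise_search("t.st", title))   # False
print(bytewise_search("t..st", title))  # True
print(charwise_search("t.st", title))   # True
```

The same pattern therefore behaves differently depending on whether the engine sees bytes or characters.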

Consider using latin1 instead of utf8. YouTube video titles often contain emoji and other characters that latin1 cannot represent, but those aren't really needed for matching. They will show as garbage, though.

zieren commented 3 years ago

latin1 is good old ISO-8859-1. Pointers:
https://www.php.net/function.utf8-decode (to process strings received from AutoHotkey)
https://dev.mysql.com/doc/refman/5.7/en/charset-mysql.html
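As a rough illustration of the latin1 idea (in Python rather than the PHP actually used here), the sketch below converts a title to ISO-8859-1 before matching. The helper `to_latin1` is hypothetical, and replacing unrepresentable characters with "?" is an assumption of this sketch, chosen because PHP's `utf8_decode` behaves similarly.

```python
import re


def to_latin1(text: str) -> bytes:
    """Encode as ISO-8859-1; characters outside latin1 (e.g. emoji) become '?'."""
    return text.encode("latin-1", errors="replace")


title = "t\u00e4st \U0001F600"  # "täst" followed by an emoji
encoded = to_latin1(title)

print(encoded)                                    # b't\xe4st ?'
print(re.search(rb"t.st", encoded) is not None)   # True
```

With one byte per character, a single-byte wildcard lines up with a single character again, at the cost of losing anything outside latin1.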

zieren commented 3 years ago

Fixed in e2e7d17b6ffc708162ba92cec2cadc72be7c3e8e.