dadap / flingon-assister

A port of boQwI' to Flutter
Apache License 2.0

Searching by definition in non-Latin DB languages doesn't work #27

Closed dadap closed 6 years ago

dadap commented 6 years ago

Probably some Unicode black magic, but if the database language is set to Farsi, Russian, or Chinese, searches work by Klingon entry name but not by localized definition content. Searching works for Swedish, so the problem isn't with the new languages in general, only with the ones that use a non-Latin alphabet.

We'll probably need to make some changes to decompose Swedish accented letters and lowercase Russian text, anyway.
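A minimal sketch of what that normalization could look like, assuming the search compares lowercased, decomposed text. The names `decomposed` and `normalizeForSearch` are hypothetical and not taken from the app, and the hand-written map only covers the three Swedish letters; a full Unicode decomposition would need a normalization library.

```dart
// Hypothetical lookup table (not from the app): maps each precomposed
// Swedish letter to its base letter followed by a combining mark.
const Map<String, String> decomposed = {
  '\u00E5': 'a\u030A', // å -> a + combining ring above
  '\u00E4': 'a\u0308', // ä -> a + combining diaeresis
  '\u00F6': 'o\u0308', // ö -> o + combining diaeresis
};

// Hypothetical helper: lowercases the input, then decomposes å, ä and ö.
String normalizeForSearch(String input) {
  // toLowerCase() applies Unicode case mapping, so Cyrillic is lowercased too.
  final lowered = input.toLowerCase();
  // Swap each precomposed letter for its decomposed form.
  return lowered.replaceAllMapped(
      RegExp('[\u00E5\u00E4\u00F6]'), (m) => decomposed[m[0]] ?? '');
}

void main() {
  print(normalizeForSearch('\u00C4rlig')); // Swedish, capital Ä
  print(normalizeForSearch('ПРИВЕТ')); // Russian, upper case
}
```

Applying the same function to both the query and the stored definitions would keep the two sides comparable.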

dadap commented 6 years ago

Damn. I seem to have shot myself in the foot:

// Strip away any non-alpha characters (pIqaD and "'" count as alpha)
string = string.replaceAllMapped(new RegExp('[^a-zA-Zß\u0308\'- \-]'),
                                 (m) => '')

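The intention was to strip away punctuation characters, but it was clearly overzealous to try to capture a whitelist of allowed characters: the negated character class deletes everything that isn't listed, which includes all Farsi, Russian, and Chinese text. Maybe instead this should only strip away non-alpha characters with code points below 128 (U+0080), i.e. in the ASCII range, and let everything else pass through.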
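A rough sketch of that idea, not the actual fix: the names `asciiNonAlpha` and `stripAsciiNonAlpha` are made up for illustration, and the set of ASCII characters kept (letters, apostrophe, space, hyphen) is an assumption based on the snippet above.

```dart
// Hypothetical pattern: matches one ASCII character (U+0000 to U+007F)
// that is NOT a letter, apostrophe, space or hyphen. The negative
// lookahead excludes the characters we want to keep.
final RegExp asciiNonAlpha = RegExp(r"(?![a-zA-Z' \-])[\x00-\x7F]");

// Hypothetical helper: removes ASCII punctuation, digits and control
// characters. Anything above U+007F is never matched, so it passes through.
String stripAsciiNonAlpha(String input) =>
    input.replaceAll(asciiNonAlpha, '');

void main() {
  print(stripAsciiNonAlpha('Hello, world!'));
  print(stripAsciiNonAlpha('привет, мир!'));
  print(stripAsciiNonAlpha("jIH'e'?"));
}
```

With this approach the pattern no longer has to enumerate every allowed script. The trade-off is that non-ASCII punctuation, such as the full-width Chinese exclamation mark, also passes through untouched.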