Closed by dadap 6 years ago
Damn. It looks like I shot myself in the foot:
// Strip away any non-alpha characters (pIqaD and "'" count as alpha)
string = string.replaceAllMapped(new RegExp('[^a-zA-Zß\u0308\'- \-]'),
(m) => '')
The intention was to strip away punctuation characters, but trying to enumerate a whitelist of allowed characters was clearly overzealous. Maybe this should instead strip only non-alpha characters with code points below 128 (that is, below U+0080), and let everything else pass through.
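A minimal sketch of that idea, not the actual fix: the helper name `stripAsciiPunctuation` is made up for illustration, and it assumes that apostrophe, space, and hyphen should still be kept, as in the original character class. The pattern only matches ASCII characters that are not letters, so anything at or above U+0080 is left alone.

```dart
/// Hypothetical helper: removes ASCII digits, punctuation, and control
/// characters, but passes every code unit at or above U+0080 through.
String stripAsciiPunctuation(String input) {
  // Negated class: keep ASCII letters, apostrophe, space, hyphen, and
  // the whole range U+0080..U+FFFF (which also covers surrogate halves,
  // so characters outside the BMP survive intact).
  final asciiNonAlpha = RegExp(r"[^a-zA-Z' \-\u0080-\uFFFF]");
  return input.replaceAll(asciiNonAlpha, '');
}
```

With this, `stripAsciiPunctuation('привет, мир!')` keeps the Cyrillic letters and drops only the comma and the exclamation mark. The ß and U+0308 from the original whitelist no longer need to be listed, since both are above U+007F. The trade-off is that non-ASCII punctuation also passes through untouched.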
This is probably some Unicode black magic: if the database language is set to Farsi, Russian, or Chinese, searching by Klingon entry name works, but searching by localized definition content does not. Swedish seems to work, so the problem is not with the new languages in general, only with the ones that use a non-Latin alphabet.
We'll probably need to make some changes to decompose Swedish accented letters and lowercase Russian text, anyway.
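Something along these lines might do for that step. This is only a sketch: `normalizeForSearch` and the mapping table are hypothetical, the table covers just the three Swedish letters å, ä, and ö, and a real implementation would more likely use full Unicode decomposition (NFD) from a package instead of a hand-written map.

```dart
// Hypothetical table: precomposed Swedish letters mapped to a base
// letter followed by a combining mark.
const Map<String, String> swedishDecompositions = {
  '\u00E5': 'a\u030A', // a with ring above -> a + combining ring above
  '\u00E4': 'a\u0308', // a with diaeresis  -> a + combining diaeresis
  '\u00F6': 'o\u0308', // o with diaeresis  -> o + combining diaeresis
};

/// Hypothetical helper: lowercases the text, then decomposes the
/// Swedish accented letters listed in the table above.
String normalizeForSearch(String input) {
  // Lowercasing first means the table only needs lowercase keys.
  final lowered = input.toLowerCase();
  return lowered.replaceAllMapped(
    RegExp('[\u00E5\u00E4\u00F6]'),
    (m) => swedishDecompositions[m[0]]!,
  );
}
```

Both the query and the stored definition text would have to go through the same function for matches to line up.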