Open taygun opened 2 years ago
Hmm yes, I can confirm the issue you are seeing. It seems to affect queries to the /v1/autocomplete endpoint but not the /v1/search endpoint, which helps narrow down the scope.
We use the icu-folding filter in Elasticsearch to 'fold' the Cyrillic form to the Latin form.
It seems as though we are using this filter correctly in all of the analyzers, with the exception of peliasHousenumber, which has a numeric character filter, so the folding doesn't apply there.
I'm not really sure what's going on here. The expected behaviour is that we fold Cyrillic to ASCII for precisely this purpose.
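One way to check which analyzers treat the two spellings identically is to run both through the Elasticsearch `_analyze` API and compare the tokens. The following is only a rough sketch: the host `http://localhost:9200`, the index name `pelias`, and the helper names are assumptions for illustration, not taken from the Pelias codebase.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local Elasticsearch instance
INDEX = "pelias"                  # assumption: the index name in use


def build_analyze_body(analyzer, text):
    """Build the JSON body for a POST to /<index>/_analyze."""
    return {"analyzer": analyzer, "text": text}


def tokens_from_response(response):
    """Extract the token strings from an _analyze response dict."""
    return [t["token"] for t in response.get("tokens", [])]


def analyze(analyzer, text):
    """Send one _analyze request and return the resulting tokens."""
    body = json.dumps(build_analyze_body(analyzer, text)).encode("utf-8")
    req = urllib.request.Request(
        f"{ES_URL}/{INDEX}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return tokens_from_response(json.load(resp))


def folds_to_same_tokens(analyzer, latin="77\u0061", cyrillic="77\u0430"):
    """True if the analyzer emits identical tokens for both spellings."""
    return analyze(analyzer, latin) == analyze(analyzer, cyrillic)
```

Against a running instance, calling `folds_to_same_tokens("peliasHousenumber")` would indicate whether that analyzer distinguishes the Latin and Cyrillic house number suffixes.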
Ah, very nice discovery @missinglink. I think we originally discovered this issue back in https://github.com/pelias/pelias/issues/833 but never narrowed down the cause.
It feels like adding the icu-folding filter is relatively safe, maybe we should try that out?
Describe the bug
When searching for the address ("Олега Оникієнка вулиця 77а") of this OSM place, no results are returned. The issue seems to be caused by the fact that the address is indexed with the Cyrillic "а" (U+0430). If the search query contains the Cyrillic character "а", the above address is returned.
Steps to Reproduce
Steps to reproduce the behavior:
No results returned when searched with Latin Small Letter A: pelias.github.io
Result returned when searched with Cyrillic Small Letter A: pelias.github.io
Expected behavior
The address should be returned when searching with the Latin character.
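For illustration, here is a minimal Python sketch (standard library only, not part of Pelias) showing that the two letters are distinct code points, and that standard Unicode normalization plus case folding does not unify them. The helper names are hypothetical.

```python
import unicodedata

LATIN_A = "\u0061"     # the "a" typed on a Latin keyboard
CYRILLIC_A = "\u0430"  # the visually identical Cyrillic letter


def describe(ch):
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def same_after_nfkc_casefold(a, b):
    """Compare two strings after NFKC normalization and case folding."""
    def norm(s):
        return unicodedata.normalize("NFKC", s).casefold()
    return norm(a) == norm(b)


print(describe(LATIN_A))     # U+0061 LATIN SMALL LETTER A
print(describe(CYRILLIC_A))  # U+0430 CYRILLIC SMALL LETTER A
print(same_after_nfkc_casefold("77" + LATIN_A, "77" + CYRILLIC_A))  # False
```

Because the two house numbers remain different strings after normalization, matching them requires an explicit cross-script mapping at index or query time.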