Open feherbbj opened 5 months ago
Pinging @elastic/es-search (Team:Search)
PS: Tested with 8.12.2 and it is reproducible there as well
FYI, I tested with the new Lucene version, in which we update the ICU version, and it is still reproducible.
Pinging @elastic/es-search-relevance (Team:Search Relevance)
Is there any news on this topic?
No, other than it seems like a bug :)
Elasticsearch Version
7.16.3
Installed Plugins
ICU plugin
Java Version
bundled
OS Version
Win 10
Problem Description
I have an index in which a text field contains text in numerous languages (French, Hungarian, etc.). Accented characters are quite common in those languages. The field type is wildcard. If the field value contains an uppercase accented character and a wildcard search is performed on the field with case_insensitive set to true, the document is not found when the query uses the lowercase form of that character. The document should be found. In the reproduction sample, searching for "tést" or "á" does not return any result (even though case_insensitive is set to true), while searching for "tÉst" and "Á" does.
Steps to Reproduce
DELETE test-accent

PUT test-accent
{
  "mappings": {
    "properties": {
      "text": { "type": "wildcard" }
    }
  }
}

POST test-accent/_doc/
{ "text": "tÉst" }

POST test-accent/_doc/
{ "text": "Á" }

POST test-accent/_doc/
{ "text": "E" }

GET test-accent/_search
{ "query": { "match_all": {} } }

GET test-accent/_search
{
  "query": {
    "wildcard": {
      "text": { "value": "á", "case_insensitive": true }
    }
  }
}

GET test-accent/_search
{
  "query": {
    "wildcard": {
      "text": { "value": "tést", "case_insensitive": true }
    }
  }
}
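To make the expected result concrete, here is a minimal Python sketch of what "case-insensitive wildcard match" would mean for the sample documents. It uses Python's own `re` module, not Elasticsearch or Lucene, and the helper names `wildcard_to_regex` and `matches` are hypothetical. The `unicode_aware=False` mode is included only for contrast: ASCII-only case folding happens to give the same results as those reported above, but that is an illustration, not a confirmed diagnosis of the bug.

```python
import re


def wildcard_to_regex(pattern: str) -> str:
    """Translate a wildcard pattern ('*' and '?') into a regex string.

    Hypothetical helper for illustration only; it does not handle
    backslash escapes the way a real wildcard query parser would.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # any run of characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def matches(pattern: str, value: str, unicode_aware: bool = True) -> bool:
    """Case-insensitive wildcard match of `pattern` against the whole `value`.

    With unicode_aware=False, re.ASCII restricts case folding to ASCII
    letters, so 'e' still matches 'E' but 'é' no longer matches 'É'.
    """
    flags = re.IGNORECASE | re.DOTALL
    if not unicode_aware:
        flags |= re.ASCII
    return re.fullmatch(wildcard_to_regex(pattern), value, flags) is not None


if __name__ == "__main__":
    docs = ["tÉst", "Á", "E"]
    for query in ["á", "tést", "Á", "tÉst", "e"]:
        expected = [d for d in docs if matches(query, d)]
        ascii_only = [d for d in docs if matches(query, d, unicode_aware=False)]
        print(f"{query!r}: unicode-aware={expected} ascii-only={ascii_only}")
```

With Unicode-aware folding, "á" matches the document "Á" and "tést" matches "tÉst", which is the behavior this report expects from `case_insensitive: true`. In the ASCII-only mode, those two queries match nothing, while "Á", "tÉst", and "e" still find their documents.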
Logs (if relevant)
No response