cuba-platform / fts

Full-Text Search Addon
https://www.cuba-platform.com/
Apache License 2.0

Support an analyzer setting and custom analyzers in FTS #17

Closed: haulmont-git closed this issue 6 years ago

haulmont-git commented 6 years ago

Apache Lucene has built-in [Analyzers|https://lucene.apache.org/core/6_4_0/analyzers-common/overview-summary.html] that support different languages and their accented letters in search. For example, searching for foo would also find fôo, föo, and fòo. Users should be able to select the required analyzer in the FTS properties, so that FTS either ignores accents or, conversely, matches accents exactly.

See also:
https://www.cuba-platform.com/discuss/t/diacritics-in-full-text-search-match/3329
https://community.alfresco.com/thread/194124-lucene-search-with-accented-characters
https://lucene.apache.org/core/7_1_0/analyzers-common/org/apache/lucene/analysis/miscellaneous/ASCIIFoldingFilter.html#foldToASCII-char:A-int-
https://stackoverflow.com/questions/24825662/cant-return-results-for-words-with-accents-on-lucenetika
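
To make the two requested behaviours concrete, here is a minimal sketch of accent-insensitive versus accent-sensitive matching. It uses only the JDK (java.text.Normalizer) and is not the FTS addon's code or Lucene's ASCIIFoldingFilter. The class and method names (AccentFolding, fold, matchesIgnoringAccents, matchesExactAccents) are hypothetical. Stripping combining marks is only an approximation of what a folding analyzer does: characters without a canonical decomposition are left unchanged here.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of the FTS addon or Lucene.
final class AccentFolding {

    // Decompose to NFD so an accented letter becomes base letter + combining mark,
    // then remove the combining marks (Unicode category M) and lowercase the result.
    static String fold(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    // Accent-insensitive match: both sides are folded before comparison.
    static boolean matchesIgnoringAccents(String query, String term) {
        return fold(query).equals(fold(term));
    }

    // Accent-sensitive match: only case and Unicode composition are normalized,
    // so the accents themselves must be identical.
    static boolean matchesExactAccents(String query, String term) {
        String q = Normalizer.normalize(query, Normalizer.Form.NFC).toLowerCase(Locale.ROOT);
        String t = Normalizer.normalize(term, Normalizer.Form.NFC).toLowerCase(Locale.ROOT);
        return q.equals(t);
    }

    public static void main(String[] args) {
        // The three accented variants from the issue text, written as Unicode escapes:
        // o with circumflex, o with diaeresis, o with grave.
        String[] terms = {"f\u00f4o", "f\u00f6o", "f\u00f2o"};
        for (String term : terms) {
            System.out.println(term
                    + " ignoring accents: " + matchesIgnoringAccents("foo", term)
                    + ", exact accents: " + matchesExactAccents("foo", term));
        }
    }
}
```

In this sketch the choice between the two modes is simply which method gets called. In FTS the same choice would be made by the analyzer used at both index time and query time.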


Original issue: https://youtrack.haulmont.com/issue/PL-10468

gorbunkov commented 6 years ago

Users can override the com.haulmont.fts.core.sys.IndexWriterProviderBean#createAnalyzer method to return any analyzer they need.