Search is implemented with the lunr.js library. The example from #154 is in Hebrew, which lunr does not support according to its docs: https://lunrjs.com/guides/language_support.html
Lunr first splits every string into tokens and then runs each token through a pipeline of functions. One of these functions is the trimmer, which strips non-word characters from the start and end of a token. For unsupported languages, every non-Latin character counts as a non-word character. As a result, שלום is trimmed to an empty string and cannot be found by search. To make search work for non-Latin languages, the trimmer must be removed from the pipeline (as described in the lunr docs). With plain lunr this would be easy, by adding this line to the index initialization:
this.pipeline.remove(lunr.trimmer);
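The trimming behaviour described above can be reproduced without lunr. The sketch below assumes the default trimmer strips leading and trailing `\W` matches, which is how I read the lunr source; `trimLikeLunr` is a hypothetical name used only for illustration.

```javascript
// Applies what is believed to be the same pair of replacements as lunr's
// default trimmer. Without the `u` flag, \W matches everything outside
// [A-Za-z0-9_], so every Hebrew letter counts as a "non-word" character.
function trimLikeLunr(token) {
  return token.replace(/^\W+/, '').replace(/\W+$/, '');
}

// Latin token: only the surrounding punctuation is stripped.
const trimmedEnglish = trimLikeLunr('"hello",');

// Hebrew token: every character is stripped, leaving nothing to index.
const trimmedHebrew = trimLikeLunr('שלום');
```

Here `trimmedEnglish` keeps the word itself, while `trimmedHebrew` ends up as an empty string, which is why the Hebrew name never reaches the index.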
Topola, however, uses the lunr.multi extension, which generates its trimmer dynamically from the list of provided languages, so that trimmer cannot be passed to this.pipeline.remove. Instead, I recreated the logic of the lunr.multi extension in a custom function that omits the trimmer generation.
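A minimal sketch of such a custom function is shown here. It is not the exact code in this PR: the name `multiLanguageWithoutTrimmer` and its parameters are hypothetical, and it assumes that lunr-languages exposes each language's `stopWordFilter` and `stemmer` under `lunr[lang]`, with English provided by lunr itself.

```javascript
// Hypothetical helper: builds a lunr plugin that combines several languages
// in the way lunr.multi is understood to, but never creates a trimmer.
// `lunrLib` is the lunr module with the required language plugins loaded.
function multiLanguageWithoutTrimmer(lunrLib, languages) {
  const pipeline = [];
  const searchPipeline = [];
  for (const lang of languages) {
    // English ships with lunr itself; other languages live under lunrLib[lang].
    const support = lang === 'en' ? lunrLib : lunrLib[lang];
    if (!support) {
      throw new Error('Language support not loaded: ' + lang);
    }
    if (support.stopWordFilter) {
      pipeline.unshift(support.stopWordFilter); // stop-word filters go first
    }
    if (support.stemmer) {
      pipeline.push(support.stemmer); // stemmers go last
      searchPipeline.push(support.stemmer);
    }
  }
  // The returned function is meant to be passed to `this.use(...)`
  // inside the lunr index builder callback.
  return function () {
    this.pipeline.reset();
    this.pipeline.add(...pipeline);
    if (this.searchPipeline) {
      this.searchPipeline.reset();
      this.searchPipeline.add(...searchPipeline);
    }
  };
}
```

Inside the index initialization it would be applied as `this.use(multiLanguageWithoutTrimmer(lunr, ['en', 'ru']))`, where the language list is only an example. Because no trimmer is ever added to the pipeline, tokens such as שלום reach the index unchanged.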
Fixes #154.