MihaiValentin / lunr-languages

A collection of languages stemmers and stopwords for Lunr Javascript library
Other
431 stars 163 forks

Improvements to Japanese search index #39

Closed. danjarvis closed this 7 years ago

danjarvis commented 7 years ago

Define and register a Japanese trimmer. lunr.ja.wordCharacters was already defined, but the lunr.ja.trimmer function was not, so no trimming was applied to Japanese tokens. I tested this and confirmed that it significantly improves the index.
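
For readers unfamiliar with what a trimmer does, here is a minimal standalone sketch of the idea: strip every leading and trailing character that is not a word character. The names generateTrimmer, JA_WORD_CHARACTERS and trimJa are hypothetical, and the character class is an assumed approximation (hiragana, katakana, common CJK ideographs, ASCII alphanumerics), not the actual value of lunr.ja.wordCharacters. In the real plugin the resulting function would also need to be registered with the lunr pipeline, which is omitted here.

```javascript
// Assumed character class for illustration only; NOT the real lunr.ja.wordCharacters.
// Hiragana, katakana, common CJK ideographs, and ASCII letters/digits.
const JA_WORD_CHARACTERS = "\\u3040-\\u309F\\u30A0-\\u30FF\\u4E00-\\u9FFFa-zA-Z0-9";

// Hypothetical helper: build a trimmer from a character class.
function generateTrimmer(wordCharacters) {
  // Match runs of non-word characters at the start and at the end of a token.
  const startRegex = new RegExp("^[^" + wordCharacters + "]+");
  const endRegex = new RegExp("[^" + wordCharacters + "]+$");
  return function (token) {
    return token.replace(startRegex, "").replace(endRegex, "");
  };
}

const trimJa = generateTrimmer(JA_WORD_CHARACTERS);

// Corner brackets and the full stop are not in the class, so they are stripped.
console.log(trimJa("「検索」"));
console.log(trimJa("インデックス。"));
```

Without such a trimmer, a token like 「検索」 would be indexed with its brackets attached and would not match a query for 検索, which is consistent with the improvement I saw in testing.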

Refactor the tokenizer to include position metadata for each token. Fixes #38.
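
To illustrate what position metadata means here, the following is a rough sketch rather than the code in this PR. It assumes the text has already been split into segments by some segmenter, and it records for each token a [start, length] pair plus its index, which as far as I understand is the shape lunr 2.x uses for token metadata. The function name tokensWithPositions and the plain-object token shape are hypothetical.

```javascript
// Hypothetical helper: attach [start, length] positions to pre-segmented tokens.
// `segments` is assumed to come from a segmenter and to appear in order in `text`.
function tokensWithPositions(text, segments) {
  const tokens = [];
  let cursor = 0; // search from here so repeated words get distinct positions

  for (const segment of segments) {
    const word = segment.trim();
    if (word === "") continue; // skip whitespace-only segments

    const start = text.indexOf(word, cursor);
    if (start === -1) continue; // segment not found in the text; skip it

    tokens.push({
      str: word,
      metadata: { position: [start, word.length], index: tokens.length }
    });
    cursor = start + word.length;
  }
  return tokens;
}

const text = "私の名前は中野です";
const segments = ["私", "の", "名前", "は", "中野", "です"];
const tokens = tokensWithPositions(text, segments);
console.log(tokens.map(t => t.str + " " + JSON.stringify(t.metadata.position)));
```

With positions recorded this way, a consumer of the index can map a match back to the original text, for example to highlight 中野 at offset 5 with length 2.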