Closed ming300 closed 10 years ago
Thanks for taking the time to implement this. The right place for these changes is a plugin for lunr; there is already a repo containing many language adapters. Please open a pull request against that repository, implementing your changes as a plugin (the other language adapters there should give you a good idea of how to structure this).
I use a Node.js module to process Chinese content instead of the default one. The repo is here: https://github.com/codepiano/lunr.js
For articles with Chinese content, I segment the text of each article into words using Apache Lucene and store the segmented string in example_data.json (in the example, this is the "tags" field).
Changes to lunr.js:
1. lunr.tokenizer: splits on punctuation, relying on the Chinese text having already been segmented.
2. lunr.trimmer: no longer filters Chinese tokens.
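The two changes above can be sketched roughly as follows. This is a minimal illustration, not the code from the linked repo: the names `tokenizeSegmented`, `trimUnlessChinese`, `SEPARATORS`, and `HAS_CHINESE` are hypothetical, and the exact separator set is an assumption. It assumes the text has already been segmented so that words are separated by spaces or punctuation.

```javascript
// Hypothetical helpers for illustration; not part of lunr.js itself.

// Runs of whitespace plus common ASCII and CJK punctuation (assumed set).
const SEPARATORS = /[\s,.;:!?"'()\[\]，。；：！？、“”‘’（）《》]+/;

// Any character in the CJK Unified Ideographs block.
const HAS_CHINESE = /[\u4e00-\u9fff]/;

// Split pre-segmented text on whitespace and punctuation,
// drop empty pieces, and lowercase the rest.
function tokenizeSegmented(text) {
  if (text == null) return [];
  return String(text)
    .split(SEPARATORS)
    .filter((t) => t.length > 0)
    .map((t) => t.toLowerCase());
}

// Strip leading and trailing non-word characters, but leave tokens that
// contain Chinese characters untouched: \W matches CJK ideographs, so
// trimming them with \W would erase the whole token.
function trimUnlessChinese(token) {
  if (HAS_CHINESE.test(token)) return token;
  return token.replace(/^\W+/, "").replace(/\W+$/, "");
}
```

The trimmer check matters because a `\W`-based trim reduces a purely Chinese token to an empty string, which would remove it from the index.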