MihaiValentin / lunr-languages

A collection of language stemmers and stopwords for the Lunr JavaScript library
432 stars, 161 forks

Search indexing for Chinese language (lunr.zh) does not work with multi-language #89

Open blackwidow207 opened 2 years ago

blackwidow207 commented 2 years ago

lunr.zh handles spaces a little differently, so when it is combined with another language for multi-language support, all the words in a sentence are merged into a single indexed term and users cannot search for an individual word. (Bonus feature: searching for an entire sentence works, though 🤣)
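
To illustrate the symptom, here is a minimal sketch. It assumes a plain whitespace tokenizer as a stand-in for what happens when the Chinese segmenter is not applied. `whitespaceTokenizer` and `hasTerm` are hypothetical helpers, not part of lunr or lunr-languages.

```javascript
// Hypothetical stand-in for a whitespace-based tokenizer; not lunr's actual code.
function whitespaceTokenizer(text) {
  return text.trim().toLowerCase().split(/\s+/).filter(Boolean);
}

// Exact-term lookup, roughly what a lookup against an inverted index does.
function hasTerm(terms, query) {
  return terms.includes(query.toLowerCase());
}

const englishTerms = whitespaceTokenizer("Search works fine");
const chineseTerms = whitespaceTokenizer("搜索功能正常");

console.log(englishTerms.length);                // 3: one term per word
console.log(chineseTerms.length);                // 1: the whole sentence is one term
console.log(hasTerm(chineseTerms, "搜索"));       // false: a single word is not found
console.log(hasTerm(chineseTerms, "搜索功能正常")); // true: only the full sentence matches
```

Chinese text has no spaces between words, so a space-based split cannot produce word-level terms.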

knubie commented 2 years ago

See #45

blackwidow207 commented 2 years ago

Thanks! I will give it a try. I expanded the existing unit tests for multi-language support and found the same issue with Japanese, Thai, and Chinese; hopefully this will solve it for all three 🤞

1921Aaron commented 1 year ago

Is it resolved?

czy88840616 commented 1 year ago

Just use this; it works well.

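  // Presumably inside the lunr configuration callback, where `this` is the index builder.
  this.use(lunr.multiLanguage('en', 'zh'));
  // Run both tokenizers on every value and index the combined tokens.
  this.tokenizer = function (x) {
    return lunr.tokenizer(x).concat(lunr.zh.tokenizer(x));
  };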

  this.ref('id');
  this.field('title');
  this.field('body');

// ...
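
The concatenation idea in the snippet above can be tried without any dependencies. This is only a sketch under stated assumptions: `spaceTokenizer` and `hanCharTokenizer` are hypothetical stand-ins, and splitting into single Han characters is a crude substitute for the dictionary-based segmentation a real Chinese tokenizer performs.

```javascript
// Hypothetical stand-ins; neither function is part of lunr or lunr-languages.
function spaceTokenizer(text) {
  return text.trim().toLowerCase().split(/\s+/).filter(Boolean);
}

// Crude substitute for a real Chinese segmenter: one token per Han character.
function hanCharTokenizer(text) {
  return text.match(/\p{Script=Han}/gu) || [];
}

// Same shape as the workaround: run both tokenizers and concatenate the results.
function combinedTokenizer(text) {
  return spaceTokenizer(text).concat(hanCharTokenizer(text));
}

const tokens = combinedTokenizer("lunr 搜索");
console.log(tokens); // [ 'lunr', '搜索', '搜', '索' ]
```

Because the results are concatenated rather than replaced, the unsplit run of Chinese text is still indexed alongside the segmented pieces. That fits the earlier observation that whole-sentence search already worked. The trade-off is that every value is tokenized twice, so the index is likely to be somewhat larger.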