mattico / elasticlunr-rs

A partial port of elasticlunr to Rust. Intended to be used for generating compatible search indices.
Apache License 2.0

Add Korean. #50

Closed doosik71 closed 1 year ago

doosik71 commented 1 year ago

Add a Korean language module. The stemmer still needs to be implemented; for now the module is simple, but better than nothing. The Korean stop words come from https://github.com/stopwords-iso/stopwords-ko/blob/master/stopwords-ko.txt

Thanks.
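For readers unfamiliar with the idea, here is a rough sketch of what a stop-word filter with a not-yet-implemented stemmer could look like. It is not the code from this PR and does not use the elasticlunr-rs API: the names `STOP_WORDS_SAMPLE`, `stem_placeholder`, and `filter_stop_words` are hypothetical, and the three sample words are only illustrative, not copied from the linked list.

```rust
use std::collections::HashSet;

// Illustrative sample only (assumption): a real module would load the
// full stop-word list instead of these three words.
pub const STOP_WORDS_SAMPLE: &[&str] = &["그리고", "그러나", "또한"];

/// Placeholder stemmer (hypothetical): returns the token unchanged,
/// mirroring a module whose stemmer is not implemented yet.
pub fn stem_placeholder(token: &str) -> String {
    token.to_string()
}

/// Split on whitespace, drop tokens found in `stop_words`,
/// then pass the remaining tokens through the placeholder stemmer.
pub fn filter_stop_words(text: &str, stop_words: &HashSet<&str>) -> Vec<String> {
    text.split_whitespace()
        .filter(|tok| !stop_words.contains(*tok))
        .map(stem_placeholder)
        .collect()
}
```

Under these assumptions, building the set with `STOP_WORDS_SAMPLE.iter().copied().collect()` and filtering the text "검색 그리고 색인" would keep "검색" and "색인" and drop "그리고".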

mattico commented 1 year ago

I'd be happy to merge this once CI is passing!

doosik71 commented 1 year ago

The previous commit failed because test files were missing, so I added two test files (tests/data/ko.in.txt and tests/data/ko.out.txt) to make the --all-features test pass. The Korean sentences in ko.in.txt come from http://guny.kr/stuff/klorem/, which generates Korean Lorem Ipsum. Thanks.

mattico commented 1 year ago

Thanks!