mobinseven closed this 9 years ago
It is possible to use lunr.js with other languages. Take a look at https://github.com/MihaiValentin/lunr-languages for plugins that add support for other languages. That project does not currently have any support for Farsi.
A language extension is made of two parts: a stemmer and a stop word filter. By default lunr comes with an English stemmer and an English stop word filter. At a guess, I'd say these are being very confused by Farsi!
Ideally you would create a Farsi stemmer and stop word filter and put them together into a plugin. Take a look at some of the implementations in https://github.com/MihaiValentin/lunr-languages for ideas on how to do this.
Alternatively, you should be able to remove the English stemmer and stop word filter by resetting the pipeline, and hopefully get some better results.
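As a rough illustration of those two parts, here is a minimal sketch, not a real Farsi plugin. It assumes pipeline functions receive a plain string token and return either a string (keep) or undefined (drop); newer lunr releases may pass token objects instead, so check the version you are using. The names `farsiStopWords`, `farsiStopWordFilter`, `farsiStemmer` and `runPipeline` are hypothetical, the stop word list is deliberately tiny, and the "stemmer" only strips one plural suffix.

```javascript
// Hypothetical, deliberately tiny stop word list (assumption: a real plugin
// would ship a much longer, curated list).
var farsiStopWords = new Set(['و', 'در', 'به', 'از', 'که'])

// Stop word filter: return the token to keep it, undefined to drop it.
function farsiStopWordFilter (token) {
  return farsiStopWords.has(token) ? undefined : token
}

// Placeholder stemmer (assumption: NOT linguistically complete). It strips
// only the plural suffix U+0647 U+0627, plus an optional zero-width
// non-joiner (U+200C) in front of it, and never returns an empty string.
function farsiStemmer (token) {
  var stem = token.replace(/\u200c?\u0647\u0627$/, '')
  return stem.length > 0 ? stem : token
}

// Stand-in for a pipeline, so the sketch runs without lunr: each function is
// applied to every token in turn and dropped tokens are removed.
function runPipeline (tokens, fns) {
  return fns.reduce(function (current, fn) {
    return current
      .map(function (token) { return fn(token) })
      .filter(function (token) { return token !== undefined })
  }, tokens)
}

// With lunr loaded, functions like these could be wired in roughly as follows
// (inside the lunr(function () { ... }) configuration block):
//   lunr.Pipeline.registerFunction(farsiStopWordFilter, 'stopWordFilter-fa')
//   lunr.Pipeline.registerFunction(farsiStemmer, 'stemmer-fa')
//   this.pipeline.reset()
//   this.pipeline.add(farsiStopWordFilter, farsiStemmer)
```

Calling `runPipeline(tokens, [farsiStopWordFilter, farsiStemmer])` on an array of words shows what would end up in the index: stop words disappear and the remaining words are reduced by the stemmer. The existing plugins in lunr-languages remain the better reference for how a complete plugin is packaged.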
var idx = lunr(function () {
  // Empty the pipeline, removing the default English stemmer and stop word filter
  this.pipeline.reset()
  this.field('fieldname')
})
Let me know if you manage to come up with a plugin for Farsi. If you do, it'd be great to get it added to the https://github.com/MihaiValentin/lunr-languages project for others to use as well.
I don't know how to generate those stemmers, and they didn't provide good documentation. Can you give me some advice?
How can I use a custom tokenizer method written for another language? I have a method which returns words like this: //دیوان //اشعار //شامل //غزلیات //قصیده //مثنوی //قطعات //رباعیات How can I integrate such a method with lunr?