mattico / elasticlunr-rs

A partial port of elasticlunr to Rust. Intended to be used for generating compatible search indices.
Apache License 2.0

Add Arabic #40

Closed: abdnh closed 2 years ago

abdnh commented 2 years ago

This adds a minimal Arabic stemmer based on https://github.com/MihaiValentin/lunr-languages/blob/master/lunr.ar.js

I originally wrote this for use with mdBook. It's not a full port of the JS implementation, because I found that many parts of the original code made search quality worse for my use case. If full compatibility with lunr-languages is required, I think I can work on a full port at some point.

mattico commented 2 years ago

Are you using a modified version of the JS code to execute the searches as well?

abdnh commented 2 years ago

Yes, you can see it here: https://github.com/abdnh/mdBook/commit/1d537b5f6798c95e27f1f96c0c7dc08bb0f40715#diff-5d9c3e8a9c6b09e2f155e0820b2357172db3886542d2f1a3aa8cfcc46d5b26a6

I use it on this site: https://www.abdnh.net/anki-manual/

mattico commented 2 years ago

That's a bit tricky, because I don't think it makes sense to support generating search indexes that no available JS implementation can actually use. We need to at least point to a compatible JS implementation for people to use for each language.

I think what we should actually do is include the JS in this repository, so that we are free to modify it and can run integration tests against it. Upstream elasticlunr is basically unmaintained at this point, so I suppose we should maintain our own copy. I'll try to work on this soon.
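
To illustrate the compatibility concern above, here is a minimal sketch of what happens when the index is built with one stemmer and queried with another. Everything in it is hypothetical: `stem_minimal`, `stem_aggressive`, `build_index`, and `search` are toy stand-ins and are not the elasticlunr-rs API or the lunr-languages Arabic stemmer.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical toy stemmer: strips only the Arabic definite article "ال".
fn stem_minimal(token: &str) -> String {
    token.strip_prefix("ال").unwrap_or(token).to_string()
}

// Hypothetical, more aggressive toy stemmer: also strips a trailing "ة".
fn stem_aggressive(token: &str) -> String {
    let t = token.strip_prefix("ال").unwrap_or(token);
    t.strip_suffix("ة").unwrap_or(t).to_string()
}

// Build a tiny inverted index: stemmed term -> set of document ids.
fn build_index(
    docs: &[(usize, &str)],
    stem: fn(&str) -> String,
) -> HashMap<String, HashSet<usize>> {
    let mut index = HashMap::new();
    for (id, text) in docs {
        for token in text.split_whitespace() {
            index
                .entry(stem(token))
                .or_insert_with(HashSet::new)
                .insert(*id);
        }
    }
    index
}

// Look up a single-word query after stemming it with the given stemmer.
fn search(
    index: &HashMap<String, HashSet<usize>>,
    query: &str,
    stem: fn(&str) -> String,
) -> HashSet<usize> {
    index.get(&stem(query)).cloned().unwrap_or_default()
}

fn main() {
    let docs = [(1usize, "المكتبة الجديدة")];
    // The index is generated with the minimal stemmer.
    let index = build_index(&docs, stem_minimal);

    // Query side uses the same stemmer: the document is found.
    let same = search(&index, "المكتبة", stem_minimal);
    // Query side uses a different stemmer: the stems disagree, so nothing matches.
    let different = search(&index, "المكتبة", stem_aggressive);

    println!("same stemmer hits: {}", same.len());
    println!("different stemmer hits: {}", different.len());
}
```

In this toy setup the index stores the stem produced by `stem_minimal`, while `stem_aggressive` reduces the same query word further, so the lookup misses. That is the kind of mismatch an integration test against a JS implementation kept in the repository could catch.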

mattico commented 2 years ago

Sorry for the wait!

I'm going to release a 3.0.0 version with our suggested JS included in the repository.