Closed olivernn closed 7 years ago
I've tried it with the French language and it works well. It would be nice if it could be merged.
Thanks @olivernn for this PR. It helped me understand the upcoming changes in Lunr 2.
So, I took the insight from here and overhauled Lunr Languages to be compatible with all Lunr versions (0.6.0, 0.7.0, 1.0.0, 2.0.0-alpha.5). The code is now in master and I bumped Lunr Languages to version 1.0.0.
This will help users: no matter which Lunr version they use, they'll just have to make sure they use the latest Lunr Languages version.
In order to enforce this, I added integration tests that test every combination of Lunr versions × Lunr Languages languages.
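As a rough illustration of what such a versions × languages matrix looks like, here is a minimal sketch. It is not the code in test/VersionsAndLanguagesTest.js: the language codes are an illustrative subset and `buildMatrix` is a hypothetical helper.

```javascript
// Lunr versions named in this thread.
const lunrVersions = ["0.6.0", "0.7.0", "1.0.0", "2.0.0-alpha.5"];

// Illustrative subset of language codes (assumption; the real list lives in the repo).
const languages = ["fr", "de", "jp"];

// Hypothetical helper: build every (version, language) pair to be tested.
function buildMatrix(versions, langs) {
  const pairs = [];
  for (const version of versions) {
    for (const language of langs) {
      pairs.push({ version: version, language: language });
    }
  }
  return pairs;
}

const matrix = buildMatrix(lunrVersions, languages);

// A real test runner would load that Lunr version plus the language plugin here
// and run the language's test cases; this sketch only lists the combinations.
for (const pair of matrix) {
  console.log("lunr@" + pair.version + " + lunr." + pair.language);
}
```

Each pair would then become one test case, so a regression in any single version/language combination shows up on its own.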
In this way, we'll achieve two things:
I will close this MR now.
Here's the commit in which these changes were made: https://github.com/MihaiValentin/lunr-languages/commit/4c64ac618e5c89868c0755761cb6f510d0a74d91 . The key changes were made in:
lunr.template - forward-compatibility to Lunr 2
test/testdata/<languages>.js - testcases for all the languages
test/VersionsAndLanguagesTest.js - the test that tests all Lunr versions with all the languages testcases
lunr.jp.de (this is not generated from lunr.template) - support for the Japanese tokenizer across all Lunr versions
lunr.multi.js - the multi language support also required using the searchPipeline for correctly stemming the search terms in Lunr 2
lunr.stemmer.support.js - forward-compatibility to Lunr 2
Should you have any questions, please comment on the commit linked above, or let's talk in Gitter.
@iDams , you can now use it :).
💯 Nice work @MihaiValentin!
This commit adds support for the upcoming release of Lunr 2.x. That release is not out yet, so it's probably best to wait to merge this until I do that release. Ideally, though, I'd like the new Lunr to have the same great language support from day one, so I'm putting this out now to get feedback.
Lunr has changed the interface for pipeline functions. Before, tokens were passed to pipeline functions as strings; Lunr 2.x changes this, wrapping them in a lunr.Token object. This means that all pipeline functions that expect to be working with a string need to be updated to work with a lunr.Token.
This change covers most of the language plugins in this repository. The Japanese plugin required a few more changes, specifically to make use of the new per-index tokeniser. This should allow Japanese and non-Japanese indexes to coexist.
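To make the pipeline-function change concrete, here is a minimal sketch of one way a plugin could accept both plain strings (Lunr 0.x/1.x) and token objects (Lunr 2.x). It assumes the token exposes `update(fn)` and `toString()`, as lunr.Token does in Lunr 2. `FakeToken`, `toPipelineFunction` and `stripPluralS` are hypothetical names used only so the sketch runs without Lunr installed.

```javascript
// Hypothetical stand-in for lunr.Token, so this runs without Lunr installed.
class FakeToken {
  constructor(str) {
    this.str = str;
  }
  toString() {
    return this.str;
  }
  // Applies fn to the wrapped string and returns the token itself.
  update(fn) {
    this.str = fn(this.str);
    return this;
  }
}

// Hypothetical helper: adapt a string -> string transform into a pipeline
// function that works with both interfaces.
function toPipelineFunction(transform) {
  return function (token) {
    // Lunr 2.x style: the token is an object with an update method.
    if (token && typeof token.update === "function") {
      return token.update(function (word) {
        return transform(word);
      });
    }
    // Lunr 0.x / 1.x style: the token is a plain string.
    return transform(token);
  };
}

// Toy transform standing in for a real stemmer (assumption, not a real stemmer).
const stripPluralS = function (word) {
  return word.replace(/s$/, "");
};

const pipelineFn = toPipelineFunction(stripPluralS);

const fromString = pipelineFn("tokens");
const fromToken = pipelineFn(new FakeToken("tokens")).toString();

console.log(fromString, fromToken);
```

Detecting the interface by feature (`typeof token.update`) rather than by version number keeps a single plugin file usable across Lunr releases.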
There is one potential issue though: searches are now parsed by lunr.QueryParser, which expects terms to be whitespace separated. I don't know enough (none) Japanese to tell from the demos whether this is an issue or not; perhaps someone can lend a hand here.
I have not changed any of the versions etc., as I don't know what you want to do here. I'd like to suggest still keeping support for the 0.x and 1.x branches of Lunr, as well as the new interfaces in Lunr 2.x. Perhaps lunr-languages could have a similar versioning scheme, to indicate which major version of Lunr is supported. I'm open to ideas here though.
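On the whitespace concern raised above, the sketch below shows why it might matter. It does not use lunr.QueryParser; `splitOnWhitespace` is a hypothetical, naive stand-in that only splits a query on whitespace, and the Japanese query is an arbitrary example.

```javascript
// Naive stand-in for whitespace-based term separation (assumption: this is
// NOT Lunr's parser, only an illustration of splitting on whitespace).
function splitOnWhitespace(query) {
  return query.trim().split(/\s+/);
}

// An English query separates into one term per word.
const english = splitOnWhitespace("language support");

// Japanese text is normally written without spaces between words, so the
// whole query stays a single term unless a Japanese tokenizer segments it.
const japanese = splitOnWhitespace("日本語のサポート");

console.log(english.length, japanese.length);
```

If the index was built with a Japanese tokeniser but the query side only splits on whitespace, the unsegmented query term may not match the segmented index terms, which is the question left open here.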