slashdotdash / jekyll-lunr-js-search

[UNSUPPORTED] Jekyll + lunr.js = static websites with powerful full-text search using JavaScript
MIT License

Lunr 2.1 update & configurable indexed and template fields #118

Open nkuehn opened 7 years ago

nkuehn commented 7 years ago

Hi @slashdotdash, following up on my issue #117: I found that migrating to lunr 2.0.x can reduce the index size much more substantially, so I started developing and testing a bit.
Here's the result. It required some shuffling around in the indexer: the new lunr index is immutable, so we need to go through the Builder class if we don't want to hold the complete documents in Ruby memory in parallel.
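
For readers unfamiliar with the lunr 2.x Builder, here is a minimal sketch of the pattern on the JavaScript side. It is not the code from this pull request: `buildIndex`, the `id` ref and the `fields` parameter are hypothetical names chosen for illustration, and `lunr` is passed in as an argument so the snippet makes no assumption about how the library is loaded.

```javascript
// Sketch only, not the plugin's actual indexer.
// `lunr` is the lunr 2.x module, injected by the caller.
// `docs` can be any iterable (array, generator, ...), so documents
// can be streamed in one at a time.
function buildIndex(lunr, docs, fields) {
  const builder = new lunr.Builder();

  // The same default pipelines that the lunr() convenience function installs.
  builder.pipeline.add(lunr.trimmer, lunr.stopWordFilter, lunr.stemmer);
  builder.searchPipeline.add(lunr.stemmer);

  // Assumption: every document carries an `id` property used as its ref.
  builder.ref("id");
  fields.forEach((name) => builder.field(name));

  // Each document is handed to the builder and can then be discarded
  // by the caller; only the builder's internal state grows.
  for (const doc of docs) {
    builder.add(doc);
  }

  // build() returns the immutable lunr.Index; nothing can be added afterwards.
  return builder.build();
}
```

In Node this could be called as `buildIndex(require("lunr"), docs, ["title", "body"])`, and the result serialized with `JSON.stringify`. The field names here are assumptions; the real ones depend on the site configuration.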

To everyone: please test. This project has no built-in tests, so we need a lot of feedback from real-world sites.

slashdotdash commented 7 years ago

Awesome work @nkuehn. Thanks for taking the time to get this done. I'll test it out locally and get it merged in and released.

nkuehn commented 7 years ago

Better to stop in-depth testing for now. While researching some weird behavior in the results, I stumbled over https://github.com/olivernn/lunr.js/issues/263 and learned there that field boosting was moved to query time in the new index structure.

That's an improvement, but it means no field is boosted at all now; in particular, the title no longer plays any special role.

It looks like I'll have to touch the client code, too. Alternatively, we wait for lunr.js 2.1, which introduces per-field vectors in the index and behaves pretty well without any boosting at all.
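
To illustrate what query-time boosting could look like on the client, here is a small sketch that expands a plain search string into field-scoped, boosted clauses using the lunr 2.x query-string syntax (`field:term^boost`). The helper name `buildBoostedQuery` and the field names `title` and `body` are assumptions for illustration, not part of this plugin's client code.

```javascript
// Hypothetical helper: turns "jekyll search" plus a map of integer boosts
// into a lunr 2.x query string with one clause per term and field.
// Note: input containing lunr query operators (":", "^", "~", "*", "+", "-")
// is passed through untouched and would be interpreted by lunr's parser.
function buildBoostedQuery(input, boosts) {
  const terms = input.trim().split(/\s+/).filter(Boolean);
  const clauses = [];
  for (const term of terms) {
    for (const [field, boost] of Object.entries(boosts)) {
      // A boost of 1 is the default, so the "^1" suffix is omitted.
      clauses.push(boost === 1 ? `${field}:${term}` : `${field}:${term}^${boost}`);
    }
  }
  return clauses.join(" ");
}

// Assumed field names; matches in the title are weighted ten times higher.
const query = buildBoostedQuery("jekyll search", { title: 10, body: 1 });
console.log(query);
```

With a lunr 2.x index `idx`, the resulting string would be passed to `idx.search(query)`. The same effect can be achieved programmatically through `idx.query`, which avoids string parsing altogether.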

nkuehn commented 7 years ago

@slashdotdash lunr.js 2.1 has now been released ( https://github.com/olivernn/lunr.js/commit/cf96052b82426eb84302b64797e498aabb681e59 ) and I am pretty happy with the results I see, especially in comparison to the 2.0.x series.

So I'm skipping 2.0.x altogether for this upgrade. I have it in use on our site and am happy with the stability, but I haven't actively tried other configurations (for lack of available sites to test with).

The key changes here are:

The client side was deliberately kept compatible, although it would be nice to support more of the query language features and to make it easier to integrate into a bigger site as a JS dependency.

slashdotdash commented 7 years ago

@nkuehn Sorry I haven't made time to merge your pull request.

Would you be interested in becoming a contributor to this project so that you can merge PRs yourself?

nkuehn commented 6 years ago

Hi @slashdotdash, it's pretty certain now that I won't be contributing any more, so that won't help. I have not been able to tune the underlying lunr.js well enough to match the use case and content size of the site that's driving my motivation here. We have now switched to a SaaS search offering.

slashdotdash commented 6 years ago

@nkuehn No problem, thanks for letting me know.