Closed andymcm closed 6 years ago
Interesting find! You could probably squeeze a bit more performance out of this method by replacing the Array#reduce
with a plain for
loop too.
Adding a simple performance test for this one case would be a great help too. You can take a look at the lunr.Vector
performance test for an example. To run the performance test run make perf/pipeline_perf.js
.
Thanks for the feedback. I've changed to a for loop and added a pipeline performance test. Feel free to edit further while merging.
Thanks for updating this, I've pulled these changes into master, will put together a release shortly.
2.1.6 contains your performance fix.
Hello !
Does this improvement have an impact on the search speed or only on the index building speed ?
I use GitbookIO/plugin-lunr in a gitbook used as a reminder. I have a lot of markdown files there and some of them have lots of tokens. I wonder if the gitbook team merge GitbookIO/plugin-lunr#6 with lunr@2.1.6
i will see only a performance on the gitbook building process or if i will get also search execution performance boost.
I think technically the change affects both search and indexing pipelines, but in practice it will only make a difference to indexing, not search (unless you have very unusual search behaviour)
It certainly won't make search any slower! Although a lunr.Pipeline
is part of the search path it usually processes many fewer tokens than when building indexes. Any performance improvements from the pipeline are still likely to be dwarfed by the time spent actually performing the search. As ever though, running benchmarks relevant to your use case/data set is the only way to know for sure.
First, thanks for this library, I'm finding it extremely useful.
I was seeing poor performance adding very large fields to an index, which I tracked down to an array concat operation in the pipeline. Here are some sample numbers from my environment:
Standard Lunr:
With this fix:
Regards
Andy