olivernn / lunr.js

A bit like Solr, but much smaller and not as bright
http://lunrjs.com
MIT License

Trimmer is missing from search pipelines #532

Open dhdaines opened 1 week ago

dhdaines commented 1 week ago

Discovered in lunr.py: https://github.com/yeraydiazdiaz/lunr.py/issues/151 - but the same issue (and a similar workaround) exists in lunr.js. As the example shows, it is actually a pretty serious problem, because query terms keep their punctuation and so never match the trimmed terms in the index:

const lunr = require("lunr");
const index = lunr(function() {
    this.field("title");
    this.field("body");
    this.add({
        title: "To be or not to be?",
        body: "That is the question!",
    });
});
// Should print something, but doesn't!
console.log(index.search("What is the question?"));

dhdaines commented 1 week ago

To help anyone who runs into this problem (a new release of lunr.js appears unlikely), the workaround is simple, though not as clear as it is in Python:

const lunr = require("lunr");
const index = lunr(function() {
    this.use(function(builder) {
        // Run the trimmer before the stemmer at search time, as the indexing pipeline already does.
        builder.searchPipeline.before(lunr.stemmer, lunr.trimmer);
    });
    this.field("title");
    this.field("body");
    this.add({
        title: "To be or not to be?",
        body: "That is the question!",
    });
});
console.log(index.search("What is the question?"));

dhdaines commented 1 week ago

Edited the above because adding the stop word filter to the search pipeline actually isn't useful (probably why the code is the way it is?) - terms that were filtered out at indexing time aren't in the index, so they won't be found either way.
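
For anyone who wants to see the mismatch without installing anything, here is a minimal, dependency-free sketch of the idea. It is not lunr's code: `trimmer`, `stopWordFilter`, `STOP_WORDS`, `runPipeline`, `tokenize` and `search` are all hypothetical stand-ins written for this illustration, the stop word list is a tiny made-up subset, and stemming is left out. It only assumes what this thread describes: punctuation is trimmed and stop words are dropped at indexing time, but the trimmer does not run at search time.

```javascript
// Simplified stand-ins for pipeline functions (assumptions, not lunr's implementation).
const trimmer = (t) => t.replace(/^\W+/, "").replace(/\W+$/, "");
const STOP_WORDS = new Set(["what", "is", "the", "that"]); // illustrative subset only
const stopWordFilter = (t) => (STOP_WORDS.has(t) ? undefined : t);

// Pass every token through each function in order, dropping removed or empty tokens.
function runPipeline(fns, tokens) {
  return fns.reduce(
    (acc, fn) => acc.map(fn).filter((t) => t !== undefined && t !== ""),
    tokens
  );
}

// Lowercase and split on whitespace.
const tokenize = (text) => text.toLowerCase().split(/\s+/).filter(Boolean);

// Indexing time: trimmer, then stop word filter (stemming omitted for brevity).
const indexed = new Set(
  runPipeline([trimmer, stopWordFilter], tokenize("That is the question!"))
);

// Search time: run the query through the given pipeline, keep the terms found in the index.
function search(query, searchFns) {
  return runPipeline(searchFns, tokenize(query)).filter((t) => indexed.has(t));
}

// No trimmer at search time: "question?" is looked up verbatim and misses "question".
const withoutTrimmer = search("What is the question?", []);
// Trimmer added at search time: "question?" becomes "question" and matches.
const withTrimmer = search("What is the question?", [trimmer]);
// Stop words were never indexed, so filtering them at search time changes nothing.
const stopWordOnly = search("the", [trimmer]);

console.log(withoutTrimmer, withTrimmer, stopWordOnly);
```

In this model the first search matches nothing and the second matches `question`, which mirrors the behaviour reported above. The third search is the stop word point: the lookup fails whether or not the search pipeline filters the term.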