lucaong / minisearch

Tiny and powerful JavaScript full-text search engine for browser and Node
https://lucaong.github.io/minisearch/
MIT License

Can I multiply the boost by a score in a field? #212

Closed reubenfirmin closed 1 year ago

reubenfirmin commented 1 year ago

Let's say I have document:

doc {
   content: string;
   baseScore: int;
}

I could post-process the results and reorder by multiplying in baseScore, but is there a way to do that as part of the query?

Or, alternately, could I do something like:

doc {
   content: string;
   primary: boolean;
}

...and tell minisearch to boost by X where primary==true?

The one way I can think of to do this is to:

doc {
   content: string;
   forfiltering: string;
}

Then when I index, I set forfiltering to some unique string, e.g. "thzzz", include that term in all of my queries, and boost forfiltering. But that's hacky, so I'm looking for something more elegant.
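For comparison, the post-processing approach mentioned at the top of this comment could look roughly like the sketch below. It assumes each search result carries an `id` and a `score`, and that the base scores live in a separate lookup. The helper name `rerankByBaseScore` and the sample data are hypothetical and not part of MiniSearch.

```javascript
// Hypothetical helper (not part of MiniSearch): rescale each result's score
// by a per-document baseScore, then sort by the rescaled score, descending.
function rerankByBaseScore(results, baseScoreById) {
  return results
    .map((result) => {
      // Documents without a numeric baseScore keep their score (factor 1).
      const factor = baseScoreById.get(result.id)
      const baseScore = typeof factor === 'number' ? factor : 1
      return { ...result, score: result.score * baseScore }
    })
    .sort((a, b) => b.score - a.score)
}

// Stand-in for the array returned by a search: each entry has an id and a score.
const sampleResults = [
  { id: 1, score: 2.0 },
  { id: 2, score: 1.5 },
  { id: 3, score: 1.0 }
]
// Document 3 has no base score, so it is left unchanged.
const baseScores = new Map([[1, 1], [2, 2]])

const reranked = rerankByBaseScore(sampleResults, baseScores)
```

This works, but it needs a second pass over the results after every search, which is the overhead the question is trying to avoid.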

lucaong commented 1 year ago

Hi @reubenfirmin, yes, there is a clean way to boost specific documents based on the value of a field: the boostDocument search option.

import MiniSearch from 'minisearch'

// Say that your documents look like this:
const documents = [
  {
    id: 1,
    text: 'Some example text',
    boostFactor: 1.5
  },
  {
    id: 2,
    text: 'Some other example text',
    boostFactor: 0.9
  },
  {
    id: 3,
    text: 'Some more example text',
    boostFactor: null
  }
]

// Keep a map of documents by their ID:
const docById = documents.reduce((byId, doc) => {
  byId[doc.id] = doc
  // Return the accumulator (not the document) so the map keeps growing
  return byId
}, {})

// Initialize MiniSearch as usual and index the documents
const miniSearch = new MiniSearch({ fields: ['text'] })
miniSearch.addAll(documents)

// Upon search, use the `boostDocument` search option to specify a dynamic boosting factor:
miniSearch.search('example', {
  boostDocument: (docId) => docById[docId].boostFactor || 1
})

A boost of 1 is neutral, a boost > 1 increases the score of a result, and < 1 decreases it.
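The boolean case from the original question (boost by X where primary is true) fits the same option. The sketch below builds such a callback from a lookup of documents by ID. The factory name `makePrimaryBoost`, the sample documents, and the factor of 2 are assumptions made for illustration, not part of MiniSearch.

```javascript
// Hypothetical factory (not part of MiniSearch): builds a callback with the
// (docId) => number shape that the boostDocument search option expects.
function makePrimaryBoost(docsById, factor) {
  return (docId) => {
    const doc = docsById.get(docId)
    // Unknown IDs and non-primary documents get the neutral boost of 1.
    return doc && doc.primary === true ? factor : 1
  }
}

const primaryDocs = [
  { id: 1, content: 'Some example text', primary: true },
  { id: 2, content: 'Some other example text', primary: false }
]
// A Map keyed by document ID, built once up front.
const primaryDocsById = new Map(primaryDocs.map((doc) => [doc.id, doc]))

const boostPrimary = makePrimaryBoost(primaryDocsById, 2)
```

It would then be passed at search time as `miniSearch.search('example', { boostDocument: boostPrimary })`.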

reubenfirmin commented 1 year ago

Awesome! Does this run over the results prior to returning them?

Any chance you could extend this by giving the option to pass the SearchResult itself to the function? I have too many documents to keep a separate map of them, so I'm going to be looking them up in minisearch, which seems redundant.

That way I could store fields that get passed in at the time of sorting; e.g.

   boostResult: (result) => (result.special) ? 2 : 1

lucaong commented 1 year ago

At the moment, only the document ID and the matched term are passed to the boostDocument callback, but I will soon publish a release that also passes the document's stored fields (those listed in storeFields) as a third argument. You will then be able to do something like this:

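// Initialize MiniSearch as usual, storing the fields needed for boosting
const miniSearch = new MiniSearch({
  fields: ['text'],
  storeFields: ['text', 'boostFactor']
})

// Index the documents (the same `documents` array as in the earlier example)
miniSearch.addAll(documents)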

// Upon search, use the `boostDocument` search option to specify a dynamic boosting factor:
miniSearch.search('example', {
  boostDocument: (docId, term, storedFields) => storedFields.boostFactor || 1
})

This pull request implements the feature.

lucaong commented 1 year ago

v6.1.0, which includes this feature, is now published on npm. I am therefore closing this issue, but feel free to add further comments if necessary.
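With v6.1.0 or later, the `special` flag from the earlier `boostResult` suggestion could be handled with a callback along these lines. This is a minimal sketch: it assumes `special` is listed in storeFields, and the guard against a missing `storedFields` argument is a defensive assumption rather than documented behavior.

```javascript
// Sketch of a boostDocument callback using the (docId, term, storedFields)
// signature described above. `special` is the hypothetical stored field
// from the earlier comment; the factor of 2 is arbitrary.
const boostSpecial = (docId, term, storedFields) =>
  storedFields && storedFields.special ? 2 : 1
```

To use it, `special` would go in the `storeFields` array passed to the MiniSearch constructor, and the callback would be passed as `boostDocument: boostSpecial` in the search options. No separate map of documents is needed.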

reubenfirmin commented 1 year ago

Thanks! This is great.