lucaong / minisearch

Tiny and powerful JavaScript full-text search engine for browser and Node
https://lucaong.github.io/minisearch/
MIT License

Add a boost to recently updated docs? #223

Closed maccman closed 1 year ago

maccman commented 1 year ago

This is more of a discussion than an issue, but I'm curious if you have any pointers on how we could boost documents that have been recently edited. Ideally I would have some kind of decay function based on a date.

lucaong commented 1 year ago

Hi @maccman, this is possible by making use of the boostDocument search option. Here is roughly how I would implement it:

  1. Add a field updatedAt to your documents, represented as an integer timestamp. Add this field to the storeFields list.
  2. When searching, specify the boostDocument option as a function that calculates a boost based on updatedAt. For example, you could use an exponentially decaying function like 1 + Math.exp(-k * timeElapsedSinceUpdate), where you would choose the right value of k depending on your needs. This function goes from 2 when no time has elapsed since the last update and approaches 1 as more time elapses, effectively boosting recent documents. Other functions would work too, even a simple "if more recent than 1 day then boost = 1.5, otherwise boost = 1".
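The simpler step-function variant mentioned in step 2 could look roughly like the sketch below. The names stepBoost, ONE_DAY_MS, and RECENT_BOOST are made up for illustration and are not part of MiniSearch; the fallback to a neutral boost of 1 for documents without a timestamp is also an assumption.

```javascript
// Sketch of a flat boost for documents updated within the last day.
// stepBoost, ONE_DAY_MS and RECENT_BOOST are illustrative names, not MiniSearch API.
const ONE_DAY_MS = 24 * 60 * 60 * 1000
const RECENT_BOOST = 1.5

function stepBoost (updatedAt, now = Date.now()) {
  // Assumption: documents without a usable timestamp get a neutral boost of 1
  if (typeof updatedAt !== 'number' || Number.isNaN(updatedAt)) return 1

  // Flat boost for anything updated less than one day ago, no boost otherwise
  return (now - updatedAt) < ONE_DAY_MS ? RECENT_BOOST : 1
}

// Possible usage as a search option:
// miniSearch.search('stuff', {
//   boostDocument: (_id, _term, storedFields) => stepBoost(storedFields.updatedAt)
// })
```

Keeping the boost logic in a small standalone function like this makes it easy to test with a fixed value of now, independently of the search index.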

Example, assuming that you want to boost documents updated within the last day, with a decaying boost factor:

const miniSearch = new MiniSearch({
  fields: ['title'],
  storeFields: ['updatedAt']
})

const documents = [
  { id: 1, title: 'Some newer stuff', updatedAt: 1688984149134 },
  { id: 2, title: 'Some older stuff', updatedAt: 1688984074212 },
  // etc.
]

miniSearch.addAll(documents)

miniSearch.search('stuff', {
  boostDocument: (_id, _term, storedFields) => {
    const now = Date.now()
    const daysElapsed = (now - storedFields.updatedAt) / (1000 * 60 * 60 * 24)

    // Significantly boost documents more recent than 1 day, then taper off
    return 1 + Math.exp(-3 * daysElapsed)
  }
})
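One way to pick k is to think in terms of a half-life: with k = ln(2) / halfLife, the extra boost (the part above 1) halves every halfLife days. By that measure, the value k = 3 per day used above halves the extra boost roughly every 5.5 hours. The sketch below is only an illustration under that assumption: recencyBoost and halfLifeDays are hypothetical names, and the clamp for timestamps in the future is an added safeguard that the example above does not include.

```javascript
// Sketch: exponential recency boost parameterized by a half-life in days.
// recencyBoost, halfLifeDays and MS_PER_DAY are illustrative names, not MiniSearch API.
const MS_PER_DAY = 24 * 60 * 60 * 1000

function recencyBoost (updatedAt, halfLifeDays, now = Date.now()) {
  // Decay constant such that the extra boost halves every halfLifeDays
  const k = Math.LN2 / halfLifeDays

  // Clamp at 0 so a timestamp slightly in the future cannot push the boost above 2
  const daysElapsed = Math.max(0, (now - updatedAt) / MS_PER_DAY)

  // 2 for a document updated just now, approaching 1 for old documents
  return 1 + Math.exp(-k * daysElapsed)
}

// Possible usage, with a one-week half-life:
// miniSearch.search('stuff', {
//   boostDocument: (_id, _term, storedFields) => recencyBoost(storedFields.updatedAt, 7)
// })
```

A longer half-life keeps moderately old documents competitive, while a shorter one favors only the very latest edits.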

A few notes:

lucaong commented 1 year ago

@maccman I'll go ahead and close the issue, as I think your question is solved. Feel free to continue the discussion if necessary.