olivernn / lunr.js

A bit like Solr, but much smaller and not as bright
http://lunrjs.com
MIT License

Popularity Boosting/Pagerank #403

Closed indolering closed 5 years ago

indolering commented 5 years ago

I would think this would work well given the closed ecosystem of a site search. ElasticSearch has a tutorial on this.

olivernn commented 5 years ago

There are two ways of achieving this with lunr currently:

  1. Apply a boost to an entire document at index time. How to calculate the boost is entirely application-specific, but something similar to the votes described in the ElasticSearch tutorial would be possible.
  2. Apply a boost to the score after Lunr returns query results. The same logic as above would apply.

With the first option the boost applies to all queries; with the second it can differ from one query to the next.
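As a rough sketch of both options: the `votes` map, the `popularityBoost` helper, the `rerank` helper, and the logarithmic damping are all assumptions made up for illustration, not part of Lunr. Only the `{ ref, score }` result shape and the `boost` attribute accepted when adding a document come from Lunr itself.

```javascript
// Hypothetical popularity data: document ref -> vote count (assumed for this sketch).
const votes = { a: 0, b: 9, c: 99 };

// Dampen raw votes logarithmically so popular documents do not swamp text relevance.
// A document with 0 votes gets a boost of exactly 1, i.e. its score is unchanged.
function popularityBoost(voteCount) {
  return 1 + Math.log10(1 + voteCount);
}

// Option 1 (index time): pass the boost when adding each document, e.g.
//   this.add(doc, { boost: popularityBoost(votes[doc.id]) })
// inside the lunr builder function. The boost then applies to every query.

// Option 2 (query time): rescale the scores of the array returned by idx.search().
// `results` only needs Lunr's { ref, score } shape, so the weighting can change per query.
function rerank(results, voteCounts) {
  return results
    .map((r) => ({ ...r, score: r.score * popularityBoost(voteCounts[r.ref] || 0) }))
    .sort((x, y) => y.score - x.score);
}
```

Usage would look something like `rerank(idx.search('foo'), votes)`. Whether a logarithm is the right damping function depends on how the popularity signal is distributed in your application.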

Is there a use case that wouldn't be covered by the above options?

I think the approach in the ElasticSearch tutorial makes sense when you can't bring your calculation to the data, but I don't think this is the case with Lunr, since it is an in-memory, in-process library rather than a full-blown service. I'm happy to be proved wrong though!

indolering commented 5 years ago

No, #1 is totally reasonable. I guess this should be a plugin? I'm kinda sad that Lunr doesn't grok HTML; it would be interesting to add heuristics for headers, etc.

Related note: I'm guessing there aren't any metrics that can be used to benchmark algorithmic enhancements?

indolering commented 5 years ago

Closing for now; will reopen when there is a solid HTML processing plugin.