datamade / nyc-council-councilmatic

NYC Council version of Councilmatic
MIT License

Improve search relevance by boosting dates #128

Open jeancochrane opened 6 years ago

jeancochrane commented 6 years ago

In #120, we changed the default search settings to sort by "relevance" (Solr's internal score for how well a document matches a query) instead of by date. This dramatically improves results when a user is looking for a specific bill. For keyword searches, though, it may actually make results worse: the highest-scoring bills tend to be very old, and old bills are often not what casual users of the site are looking for. (For more background on this problem, see https://github.com/datamade/nyc-council-councilmatic/issues/88.)

To strike a better balance, we can try boosting each document's relevance by the date the bill was added, so that newer bills rank higher among otherwise similar matches. Here are the Haystack docs for boosting:

http://django-haystack.readthedocs.io/en/master/boost.html

And here's an example of a similar pattern we use in Chicago Councilmatic:

https://github.com/datamade/chi-councilmatic/blob/47c5e7d26da1cc6012e0ae7831f499e8be35405a/chicago/search_indexes.py#L21-L26
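
As a rough illustration of the idea (not the Chicago implementation linked above), here is a minimal sketch of a date-based boost factor. The function name `recency_boost`, the weekly decay, and the 1.0 to 2.0 range are all assumptions made for illustration; the real values would need tuning against actual search results.

```python
from datetime import datetime, timezone


def recency_boost(added_at, now=None):
    """Return a hypothetical boost factor that favors recently added bills.

    The result is 2.0 for a bill added within the last week and decays
    toward 1.0 as the bill gets older. Bills with no date get a neutral 1.0.
    """
    if added_at is None:
        return 1.0

    if now is None:
        now = datetime.now(timezone.utc)

    # Whole days elapsed, converted to weeks. The +1 keeps the divisor
    # at 1 or above for bills added today.
    weeks_passed = (now - added_at).days / 7 + 1

    # Dates can be in the future, which would make weeks_passed drop
    # below 1 (or go negative); clamp so the boost never exceeds 2.0.
    return 1.0 + 1.0 / max(weeks_passed, 1.0)


# Per the Haystack docs linked above, a document boost is set by adding a
# 'boost' key to the dict returned from SearchIndex.prepare(), e.g.:
#
#     def prepare(self, obj):
#         data = super().prepare(obj)
#         data['boost'] = recency_boost(obj.some_date_field)  # hypothetical field
#         return data
```

Because document boosts are applied at index time, a change like this would only take effect after the index is rebuilt, and boosts based on "now" go stale until the next rebuild.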