readthedocs / readthedocs.org

The source code that powers readthedocs.org
https://readthedocs.org/
MIT License

Search result ranking improvements #11741

Open mhilbrunner opened 2 weeks ago

mhilbrunner commented 2 weeks ago

What's the problem this feature will solve?

Currently, the search results are ranked suboptimally in some ways.

https://github.com/readthedocs/readthedocs.org/issues/8670 would go a long way in helping with some of the cases we discovered, but the above should (hopefully) be fairly general improvements.

stsewd commented 2 weeks ago

> Exact matches, all terms being in the exact order (as opposed to matching only parts, or in different term order) seem to not be valued enough.

Do you have a search example for this case?
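
For reference, one common way to weight exact-order matches more heavily in the Elasticsearch query DSL is to add a `match_phrase` clause as an optional `should` next to the regular `match`. The sketch below only illustrates that idea; the field name `content`, the function name, and the boost value are assumptions, not what readthedocs.org actually uses.

```python
def phrase_boosted_query(text: str, field: str = "content",
                         phrase_boost: float = 2.0) -> dict:
    """Build a query body that rewards documents containing the exact phrase.

    `field` and `phrase_boost` are hypothetical defaults for illustration.
    """
    return {
        "query": {
            "bool": {
                # Every hit must match the terms in some form.
                "must": [{"match": {field: {"query": text}}}],
                # Hits with all terms adjacent and in order score higher.
                "should": [
                    {"match_phrase": {field: {"query": text,
                                              "boost": phrase_boost}}}
                ],
            }
        }
    }


body = phrase_boosted_query("custom domain")
print(body["query"]["bool"]["should"][0]["match_phrase"]["content"]["boost"])
```

Because the phrase clause sits in `should`, it never excludes a result; it only raises the score of documents where the terms appear in the searched order.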

> Titles and document file names seem to count too little.

These are the boosts we set for page titles and section titles:

https://github.com/readthedocs/readthedocs.org/blob/7198d3d3d628df8873d26b203db54f4c688e52dd/readthedocs/search/faceted_search.py#L271-L272

We can try giving the page title a boost of 2 as well.
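
As a rough illustration of what such a change looks like, Elasticsearch lets a `multi_match` query weight individual fields with the `field^boost` syntax. The sketch below builds such a query body in plain Python; the field names and weights are assumptions chosen for the example and are not copied from `faceted_search.py`.

```python
from typing import Dict, List


def boosted_fields(boosts: Dict[str, float]) -> List[str]:
    """Format field names using Elasticsearch's ``field^boost`` syntax."""
    fields = []
    for name, boost in boosts.items():
        # A boost of 1 is the default, so the suffix can be omitted.
        fields.append(name if boost == 1 else f"{name}^{boost:g}")
    return fields


def build_query(text: str, boosts: Dict[str, float]) -> dict:
    """Build a multi_match query body over the boosted fields."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": boosted_fields(boosts),
            }
        }
    }


# Hypothetical field names and weights, chosen only for illustration.
EXAMPLE_BOOSTS = {"title": 2, "sections.title": 2, "sections.content": 1}
body = build_query("install guide", EXAMPLE_BOOSTS)
print(body["query"]["multi_match"]["fields"])
```

Keeping the weights in a single mapping makes it easy to try different values and compare the resulting rankings for the same search.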