sphinx-doc / sphinx

The Sphinx documentation generator
https://www.sphinx-doc.org/

Exact search in Sphinx #3301

Open impulkit opened 7 years ago

impulkit commented 7 years ago

Subject:

Problem

Environment info

tk0miya commented 7 years ago

AFAIK, there is no way to do that.

TimKam commented 7 years ago

Currently, the search index maps single words to .rst/html files.

For example, the search index "knows" that:

show occurs in files 1, 2, and 5, whereas ipsec occurs in files 1 and 2.

It knows little more.
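To make the limitation concrete, here is a minimal Python sketch of such a word-to-files index. It is an illustration only, not Sphinx's actual index format; the names `word_index` and `files_containing_all` are made up for this example.

```python
# Hypothetical word -> file-number index, mirroring the example above.
# This is NOT Sphinx's real search index format, only an illustration.
word_index = {
    "show": {1, 2, 5},
    "ipsec": {1, 2},
}


def files_containing_all(words, index):
    """Return the files that contain every word, in any order or position."""
    sets = [index.get(word, set()) for word in words]
    return set.intersection(*sets) if sets else set()


# Both files 1 and 2 match, but the index cannot tell whether "show" and
# "ipsec" are adjacent in either of them.
print(sorted(files_containing_all(["show", "ipsec"], word_index)))  # [1, 2]
```

With only this information, a query for the phrase "show ipsec" can at best be narrowed to the files that contain both words somewhere.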

If one wants to support exact searches for multiple-word strings, one needs to add the position of each word to the search index.

If this doesn't increase the index size too much, it might be a practicable approach.
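A rough Python sketch of what storing word positions could look like, assuming naive whitespace tokenization and no stemming (which the real Sphinx search does apply). The function names and the sample documents are hypothetical, not part of Sphinx.

```python
from collections import defaultdict

# Hypothetical sample documents, keyed by file number.
docs = {
    1: "show ipsec status",
    2: "ipsec will show errors",
    5: "show the log",
}


def build_positional_index(documents):
    """Map each word to {doc_id: [positions of the word in that doc]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in documents.items():
        for position, word in enumerate(text.lower().split()):
            index[word][doc_id].append(position)
    return index


def phrase_search(phrase, index):
    """Return the doc ids in which the words of `phrase` occur consecutively."""
    words = phrase.lower().split()
    if not words or any(word not in index for word in words):
        return set()
    # Only documents containing every word can contain the phrase.
    candidates = set(index[words[0]])
    for word in words[1:]:
        candidates &= set(index[word])
    hits = set()
    for doc_id in candidates:
        # The phrase matches if word i sits exactly i positions after the start.
        for start in index[words[0]][doc_id]:
            if all(start + i in index[w][doc_id] for i, w in enumerate(words)):
                hits.add(doc_id)
                break
    return hits


index = build_positional_index(docs)
print(phrase_search("show ipsec", index))  # {1}
```

The cost is one stored position per word occurrence rather than one entry per word and file, which is where the index-size concern comes from.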

@tk0miya Do you think it is worth looking further into this?

dlmurphy commented 6 years ago

Dataverse uses Sphinx (version 1.5.6) for its documentation, and we're very interested in allowing our users to do this kind of exact phrase searching. It would make it much easier for our users to quickly find exactly the information they're looking for in our documentation. For the record, we're also interested in boolean search operators like AND, OR, and NOT, but this exact phrase searching is most important for us.

Here is our issue on the subject in our repo: https://github.com/IQSS/dataverse/issues/4884

Here is an example of our use case:

One of our users searched for "terms of use" in quotation marks. This is what happens when you attempt this query in our current Sphinx search:

[Screenshot: Sphinx search results for "terms of use"]

As you can see, this currently returns a useless list of all pages that use the words "use", "user", "used", etc. However, there are plenty of relevant sections about Terms of Use in our guides that are being drowned out. When I search for "terms of use" in quotes in my local copy of the guides using SublimeText's search feature, I get these useful results:

[Screenshot: SublimeText search results for "terms of use"]

An exact phrase search would go a long way for cases like this.

abulhol commented 2 years ago

👍