Open astrojuanlu opened 3 years ago
This is because our project has the fuzzy search flag enabled. We aren't checking for spelling errors, we are just doing a fuzzy search if you search a "single term". You can use "context" to get an exact search.
Thanks for the explanation on the root cause, @stsewd! However, I wonder if this is good UX. I was confused, and so, extrapolating from N=1, I wonder if more people would get confused.
This was actually so people aren't frustrated if they made a typo or don't know the exact word. This is controlled via https://github.com/readthedocs/readthedocs.org/blob/f0da2a478705c794ce046d67a3eededf5595fd33/readthedocs/search/faceted_search.py#L178-L184 https://www.elastic.co/guide/en/elasticsearch/reference/current/common-options.html#fuzziness
We were also exploring having a UI for displaying the search options. So, not sure; personally I like the default fuzzy and prefix search, and use " if I need to search for something specific (I rarely need to use quotes, really).
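To make the behaviour described above concrete, here is a minimal sketch of how a search string could be turned into an Elasticsearch query clause: quoted input becomes an exact phrase match, and a single unquoted term gets a fuzzy match. The function name `build_query` and the field names are hypothetical, and this is not the code in faceted_search.py; only the `multi_match` clause, its `type: phrase` option and the `fuzziness: AUTO` value are standard Elasticsearch query DSL.

```python
def build_query(raw: str, fields: list[str]) -> dict:
    """Return an Elasticsearch query clause for a user's search text.

    Hypothetical helper for illustration; not the Read the Docs implementation.
    """
    text = raw.strip()
    # Quoted input such as "context" is treated as an exact phrase.
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        return {
            "multi_match": {"query": text[1:-1], "fields": fields, "type": "phrase"}
        }
    # A single unquoted term gets a fuzzy match. With "AUTO", the allowed
    # edit distance grows with the length of the term.
    if len(text.split()) == 1:
        return {
            "multi_match": {"query": text, "fields": fields, "fuzziness": "AUTO"}
        }
    # Several unquoted terms: plain match, no fuzziness.
    return {"multi_match": {"query": text, "fields": fields}}


fuzzy = build_query("context", ["title", "content"])
exact = build_query('"context"', ["title", "content"])
```

Since "context" and "content" differ by a single character, a fuzzy clause like the first one can match both, which is the confusion reported in this issue.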
I see how users might get frustrated if they made a typo, or if they used a different spelling (British vs American for example). So doing the fuzzy search is a great addition.
However, for cases in which the user didn't make a typo:
1) It is weird to see matches of words you're not interested in, and 2) There is no way to discover how to look for an exact match (yes, using double quotes)
So, I think it would be cool to have an admonition on top of the search page saying something like
Performing a fuzzy match on the backend, use "{term}" to do an exact match. Read more about the search syntax and how Read the Docs enhances search on your project.
+1 on adding that admonition in our search extension
We could also inject that message in the search page without requiring our search extension, but I'm not sure about changing that content on behalf of users.
I think search.html vs modal solve different UX issues, and both are good! 👍🏼
Adding it on the modal communicates the extra options before the user searches for a term. Also, if the user doesn't hit enter on that modal, they will never see the search.html page.
Adding it on search.html works for the case where the user searches for content and then realizes that it's showing context results as well, so they want to force an exact match after searching for the term.
In any case, I think it would also be good to have an icon or something in the search box on the left (in our theme) that users can hover over or click to go directly to the documentation.
Performing a fuzzy match on the backend, use "{term}" to do an exact match. Read more about the search syntax and how Read the Docs enhances search on your project.
Adding info here is a good idea! I would also like to see the words that are being searched appear somewhere. The admonition is great but takes long to read. Adding the searched words, besides the admonition, could be a chance to show the same idea in a visual manner and in a more intuitive way.
Maybe each word could even have an (x) nearby that the user could click to exclude that word?
I don't know how the fuzzy search works, but maybe we don't have the list of "extra words" that are being searched? I'll let others clarify that though.
@astrojuanlu yeah, not sure if we have the list (but we can extract them from the matches), but we also do a prefix search (words that start with the current search), so the list could get pretty large.
but we also do a prefix search (words that start with the current search), so the list could get pretty large.
We don't need to display the full list, just the 1st N search terms.
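As a rough sketch of the "first N search terms" idea, the helper below takes the words that actually matched (assumed here to have already been extracted from the results, for example from highlighted fragments) and returns the first few distinct ones. The function name `first_terms` and the input format are assumptions made for illustration.

```python
def first_terms(matched_words: list[str], limit: int = 5) -> list[str]:
    """Return up to `limit` distinct matched words, in order of appearance."""
    seen = set()
    terms = []
    for word in matched_words:
        if len(terms) >= limit:
            break
        key = word.lower()  # treat "Content" and "content" as the same term
        if key not in seen:
            seen.add(key)
            terms.append(key)
    return terms


shown = first_terms(["content", "Content", "context", "contents"], limit=2)
```

Capping the list this way keeps the UI short even when the prefix search expands a term into many words.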
This would be a really good UI element (Fuzzy checkbox) in the search as you type addon. I'm moving this issue over there.
Performing a fuzzy match on the backend, use "{term}" to do an exact match. Read more about the search syntax and how Read the Docs enhances search on your project.
This is tracked in https://github.com/readthedocs/addons/issues/33
@stsewd if I understand correctly, we can control the fuzzy at search time (when the user performs the query), based on this code: https://github.com/readthedocs/readthedocs.org/blob/83805a643d56764f893ff65f764bf993c74fb582/readthedocs/search/faceted_search.py#L115-L130
It seems we can have a Fuzzy checkbox in our UI the user can enable/disable and perform the query with/without using fuzziness. Besides, we can allow project authors to define the "default value" for that checkbox from the addons project admin's page.
Am I correct? What do you think?
yeah, we can control that at search time, it's under a feature flag, since it can be slow
OK. Then, the work required here would be:
- Fuzzy checkbox in the UI
- ?fuzzy=true|false attribute to the endpoint
- ?fuzzy attribute
- ?fuzzy attribute
Anything else you have in mind?
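One way the endpoint side of this list could look is sketched below: read the `fuzzy` attribute from the request URL and fall back to a per-project default when it is missing or not `true`/`false`. The function name `fuzzy_enabled`, the `project_default` parameter and the example URL are hypothetical; only the `fuzzy=true|false` attribute comes from the list above.

```python
from urllib.parse import parse_qs, urlsplit


def fuzzy_enabled(url: str, project_default: bool = True) -> bool:
    """Decide whether a search request should use fuzziness.

    Hypothetical helper: `project_default` stands in for the default
    value a project author could configure.
    """
    values = parse_qs(urlsplit(url).query).get("fuzzy")
    if not values:
        # Attribute missing (or empty): use the project's default.
        return project_default
    value = values[-1].lower()
    if value == "true":
        return True
    if value == "false":
        return False
    # Unrecognized value: fall back to the default instead of failing.
    return project_default


use_fuzzy = fuzzy_enabled("https://example.org/search/?q=context&fuzzy=false")
```

Falling back to the default for unknown values keeps existing links working if the attribute is mistyped.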
I think we should still have it under a feature flag until we prioritize optimizing ES.
@stsewd well, the goal of this issue is to remove the feature flag and expose the feature to users.
What does it mean it can be slow? Will that kill our servers? Can we add a timeout to ES queries as we do with PostgreSQL?
It seems we can specify a timeout on each query, https://www.elastic.co/guide/en/elasticsearch/reference/current/search-your-data.html#search-timeout, but also there is a search.default_search_timeout that we can set in the cluster. We should use that option.
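For reference, the two options mentioned above could be expressed as the request bodies below. This is a sketch using plain dictionaries, with hypothetical helper names and example timeout values: the first adds the per-request `timeout` field to a search body, and the second builds a body for the cluster update settings API (`PUT /_cluster/settings`) that sets `search.default_search_timeout`. Where these would live in the Read the Docs code or infrastructure is not something this sketch assumes.

```python
def with_timeout(search_body: dict, timeout: str = "2s") -> dict:
    """Return a copy of a search request body with a per-request timeout."""
    return {**search_body, "timeout": timeout}


def default_timeout_settings(timeout: str = "5s") -> dict:
    """Body for PUT /_cluster/settings that sets a cluster-wide default."""
    return {"persistent": {"search.default_search_timeout": timeout}}


body = with_timeout({"query": {"match": {"content": "context"}}})
settings = default_timeout_settings()
```

Note that the search timeout is best effort: when it is hit, Elasticsearch returns the results gathered so far and marks the response with `timed_out: true`, so callers may want to check that field.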
It seems we can specify a timeout on each query, elastic.co/guide/en/elasticsearch/reference/current/search-your-data.html#search-timeout, but also there is a search.default_search_timeout that we can set in the cluster
This would also be useful for the case described in https://github.com/readthedocs/readthedocs.org/issues/10321 👍🏼
@stsewd where does the Elastic configuration for production live? How can I define search.default_search_timeout? Do we have this in Terraform?
I have the same questions for our development instance. Reading the docs, I didn't get where I should define these settings. Do you know how to do this?
Looks like that's a search option, so probably somewhere in search/faceted_search.py. But I don't think we should expose this to all users yet. If the timeout makes the feature unusable, I don't think we should expose a broken feature.
I searched for "context" but started getting results for "content", which was confusing.