readthedocs / addons

JavaScript client to integrate with Read the Docs nicely
https://readthedocs-addons.readthedocs.io/
MIT License

Search: fuzzy checkbox UI element #84

Open astrojuanlu opened 3 years ago

astrojuanlu commented 3 years ago

I searched for "context" but started getting results for "content", which was confusing.

[Screenshot: search results page, Read the Docs 5.24.0 documentation]

stsewd commented 3 years ago

This is because our project has the fuzzy search flag enabled. We aren't checking for spelling errors; we just do a fuzzy search when you search for a single term. You can wrap the term in double quotes ("context") to get an exact search.
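To make the behavior described above concrete, here is a minimal sketch of how a backend could choose between an exact and a fuzzy query. It is not the Read the Docs implementation: the function name `build_query` and the field names `title` and `content` are hypothetical, and the output is a plain Elasticsearch `multi_match` query body.

```python
def build_query(text, fields=("title", "content")):
    """Return an Elasticsearch query fragment for the given search text.

    Hypothetical helper: field names and rules are assumptions for illustration.
    """
    text = text.strip()
    is_quoted = len(text) >= 2 and text[0] == '"' and text[-1] == '"'

    if is_quoted:
        # Exact search: match the phrase as typed, with no fuzziness.
        return {
            "multi_match": {
                "query": text[1:-1],
                "fields": list(fields),
                "type": "phrase",
            }
        }

    if len(text.split()) == 1:
        # Single term: tolerate small spelling differences.
        return {
            "multi_match": {
                "query": text,
                "fields": list(fields),
                "fuzziness": "AUTO",
            }
        }

    # Several terms: plain match, no fuzziness.
    return {"multi_match": {"query": text, "fields": list(fields)}}
```

With this sketch, `build_query("context")` produces a fuzzy query (which can also match "content", one edit away), while `build_query('"context"')` produces a phrase query for the exact word.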

astrojuanlu commented 3 years ago

Thanks for the explanation on the root cause @stsewd ! However I wonder if this is good UX. I was confused, and so extrapolating from N=1, I wonder if more people would get confused.

stsewd commented 3 years ago

This was actually so people aren't frustrated if they made a typo or don't know the exact word. This is controlled via https://github.com/readthedocs/readthedocs.org/blob/f0da2a478705c794ce046d67a3eededf5595fd33/readthedocs/search/faceted_search.py#L178-L184 (see also the Elasticsearch fuzziness option: https://www.elastic.co/guide/en/elasticsearch/reference/current/common-options.html#fuzziness).

We were also exploring having a UI for displaying the search options. So, not sure; personally I like the default fuzzy and prefix search, and I use " if I need to search for something specific (I rarely need to use quotes, really).

astrojuanlu commented 3 years ago

I see how users might get frustrated if they made a typo, or if they used a different spelling (British vs American for example). So doing the fuzzy search is a great addition.

However, for cases in which the user didn't make a typo:

1) It is weird to see matches for words you're not interested in, and
2) There is no way to discover how to look for an exact match (yes, using double quotes)

So, I think it would be cool to have an admonition on top of the search page saying something like

Performing a fuzzy match on the backend, use "{term}" to do an exact match. Read more about the search syntax and how Read the Docs enhances search on your project.

stsewd commented 3 years ago

+1 on adding that admonition in our search extension

stsewd commented 3 years ago

We could also inject that message into the search page without requiring our search extension, but I'm not sure about changing that content on behalf of users.

humitos commented 3 years ago

I think search.html vs modal solve different UX issues, and both are good! 👍🏼

Adding it on the modal communicates the extra options before the user searches for a term. Also, if the user doesn't hit enter on that modal, they will never see the search.html page.

Adding it on search.html works for the case where the user searches for "content" and then realizes that it's showing "context" results as well, so they want to force an exact match after searching for the term.

In any case, I think it would also be good to have an icon or something in the search box on the left (in our theme) that users can hover over or click to go directly to the documentation.

nienn commented 3 years ago

Performing a fuzzy match on the backend, use "{term}" to do an exact match. Read more about the search syntax and how Read the Docs enhances search on your project.

Adding info here is a good idea! I would also like to see the words that are being searched appear somewhere. The admonition is great but takes a long time to read. Showing the searched words next to the admonition could convey the same idea in a more visual and intuitive way.

Maybe each word could even have an (x) nearby that the user could click to exclude that word?

astrojuanlu commented 3 years ago

I don't know how the fuzzy search works, but maybe we don't have the list of "extra words" that are being searched? I'll let others clarify that though.

stsewd commented 3 years ago

@astrojuanlu yeah, not sure if we have the list (but we can extract them from the matches), but we also do a prefix search (words that start with the current search), so the list could get pretty large.

nienn commented 3 years ago

but we also do a prefix search (words that start with the current search), so the list could get pretty large.

We don't need to display the full list, just the 1st N search terms.
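As a rough sketch of "extract them from the matches" combined with "just the 1st N", the snippet below collects the distinct words that were highlighted in the results and keeps the most frequent ones. It assumes, for illustration only, that each highlight fragment wraps matched words in `<span>` tags; the function name `matched_terms` is hypothetical.

```python
import re
from collections import Counter

# Assumption: matched words are wrapped as <span>word</span> in each fragment.
SPAN_RE = re.compile(r"<span>(.*?)</span>")


def matched_terms(highlights, limit=5):
    """Return up to `limit` distinct matched words, most frequent first."""
    counts = Counter()
    for fragment in highlights:
        for word in SPAN_RE.findall(fragment):
            counts[word.lower()] += 1
    return [term for term, _ in counts.most_common(limit)]


highlights = [
    "The <span>context</span> is set per request",
    "Page <span>content</span> and <span>Context</span> objects",
    "<span>contextual</span> help",
]
print(matched_terms(highlights, limit=2))
```

Capping the list this way keeps the UI short even when prefix matching pulls in many variants of the search term.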

humitos commented 1 year ago

This would be a really good UI element (Fuzzy checkbox) in the search as you type addon. I'm moving this issue over there.

Performing a fuzzy match on the backend, use "{term}" to do an exact match. Read more about the search syntax and how Read the Docs enhances search on your project.

This is tracked in https://github.com/readthedocs/addons/issues/33

humitos commented 9 months ago

@stsewd if I understand correctly, we can control fuzziness at search time (when the user performs the query), based on this code: https://github.com/readthedocs/readthedocs.org/blob/83805a643d56764f893ff65f764bf993c74fb582/readthedocs/search/faceted_search.py#L115-L130

It seems we can have a Fuzzy checkbox in our UI the user can enable/disable and perform the query with/without using fuzziness. Besides, we can allow project authors to define the "default value" for that checkbox from the addons project admin's page.

Am I correct? What do you think?

stsewd commented 9 months ago

yeah, we can control that at search time, it's under a feature flag, since it can be slow
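One possible way to combine the three inputs discussed here (the user's checkbox, the project author's default, and the feature flag) is sketched below. The function and parameter names are hypothetical and do not correspond to existing Read the Docs code.

```python
def resolve_fuzzy(user_choice, project_default, flag_enabled):
    """Decide whether a query should use fuzziness.

    user_choice: True/False from the checkbox, or None if the user did not touch it.
    project_default: default value chosen by the project author.
    flag_enabled: whether the feature flag allows fuzzy search at all.
    """
    if not flag_enabled:
        # The flag wins: fuzzy search can be slow, so it stays off.
        return False
    if user_choice is None:
        # No explicit choice: fall back to the project default.
        return project_default
    return user_choice
```

In this sketch the feature flag acts as a hard gate, so removing the flag later would only mean always passing `flag_enabled=True`.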

humitos commented 9 months ago

OK. Then, the work required here would be:

Anything else you have in mind?

stsewd commented 9 months ago

I think we should still have it under a feature flag until we prioritize optimizing ES.

humitos commented 9 months ago

@stsewd well, the goal of this issue is to remove the feature flag and expose the feature to users.

What does it mean that it can be slow? Will that kill our servers? Can we add a timeout to ES queries as we do with PostgreSQL?

humitos commented 9 months ago

It seems we can specify a timeout on each query, https://www.elastic.co/guide/en/elasticsearch/reference/current/search-your-data.html#search-timeout, but also there is a search.default_search_timeout that we can set in the cluster. We should use that option.
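For reference, a minimal sketch of the two options as plain request bodies. The per-query `timeout` field and the `search.default_search_timeout` cluster setting come from the Elasticsearch documentation linked above; the helper names and the `2s` value are assumptions for illustration.

```python
def with_timeout(body, timeout="2s"):
    """Return a copy of a search request body with a per-query timeout."""
    return {**body, "timeout": timeout}


def default_timeout_settings(timeout="2s"):
    """Body for PUT /_cluster/settings that sets the cluster-wide default."""
    return {"persistent": {"search.default_search_timeout": timeout}}


def is_partial(response):
    """A search that hit the timeout reports it in the `timed_out` field."""
    return bool(response.get("timed_out", False))


search_body = with_timeout({"query": {"match": {"content": "context"}}})
print(search_body)
print(default_timeout_settings())
```

Note that when a search times out, Elasticsearch returns whatever hits it gathered so far, so the caller still needs to check `timed_out` before treating the results as complete.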

humitos commented 8 months ago

It seems we can specify a timeout on each query, elastic.co/guide/en/elasticsearch/reference/current/search-your-data.html#search-timeout, but also there is a search.default_search_timeout that we can set in the cluster

This would be also useful for the case described in https://github.com/readthedocs/readthedocs.org/issues/10321 👍🏼

humitos commented 8 months ago

@stsewd where does the Elasticsearch configuration for production live? How can I define the search.default_search_timeout? Do we have this in Terraform?

I have the same questions for our development instance. Reading the docs, I couldn't work out where I should define these settings. Do you know how to do this?

stsewd commented 8 months ago

Looks like that's a search option, so probably somewhere in search/faceted_search.py. But I don't think we should expose this to all users yet. If the timeout makes the feature unusable, I don't think we should expose a broken feature.