squidfunk / mkdocs-material

Documentation that simply works
https://squidfunk.github.io/mkdocs-material/
MIT License

Search displays unrelated links #262

Closed ovasquez closed 7 years ago

ovasquez commented 7 years ago

Description

Since the search behavior was changed to highlight the matching words, I've noticed a couple of glitches in the search:

The image below shows the search listing several ## headings that don't contain "tabs" (neither in the displayed text nor in the content). (image: material-mkdocs-1)

The image below shows a document with no trace of the word "tabs" that is nevertheless listed in the search results. (image: material-mkdocs-2)

Expected behavior

The search should show only related subsections of related documents.

Actual behavior

Search shows all '##' headings for a document in the search results, and sometimes unrelated documents are shown.

Steps to reproduce the bug

  1. Go to the material documentation website
  2. Search the word "Tabs"
  3. Review the search results

Package versions

squidfunk commented 7 years ago

Two things:

  1. lunr.js does stemming by default, so "tabs" is reduced to "tab", which matches "table" in the Disqus integration section, just as an example. Whether stemming makes sense here is an open question, but it is a general one.

  2. The highlighting isn't ideal: sometimes the matching word is located later in the text than the part that is shown. I would also like to see a summary here, like on Google, which highlights the matched words, but this gets complex very fast and probably isn't generalizable to all languages. I prototyped something that breaks up the sentence structure and tries to build a summary, but nothing I came up with is robust enough. For this reason I decided to do it like this (for now). If you have a good idea or know a library that does summarization in a useful way, feel free to post.

The search functionality is a huge ongoing process. I think it's fairly good now, because the presentation of results is better and results provide more context than for example the readthedocs or mkdocs original theme search. However, I'm happy for input on making it even better.
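To illustrate point 1, here is a minimal Python sketch of how stemming combined with prefix matching can surface sections that never contain the literal word "tabs". It is not lunr.js code: `toy_stem` is a hypothetical stand-in that only strips a trailing plural "s" (lunr.js uses a full stemmer), prefix matching is an assumption about how the reduced term ends up matching "table", and the section texts are made-up sample data.

```python
def toy_stem(word: str) -> str:
    """Hypothetical stand-in for a real stemmer: lowercase, strip one plural 's'."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def matching_sections(query: str, sections: dict[str, str]) -> list[str]:
    """Return titles of sections that contain a token starting with the stemmed query."""
    stem = toy_stem(query)
    hits = []
    for title, text in sections.items():
        # Strip surrounding punctuation, then stem every token of the section text.
        tokens = (toy_stem(t.strip(".,:;!?()")) for t in text.split())
        if any(token.startswith(stem) for token in tokens):
            hits.append(title)
    return hits


# Made-up sample data, only loosely modeled on the sections named in this thread.
sections = {
    "Tabs": "Enable tabs for top-level navigation.",
    "Disqus integration": "Add the following table of settings to your configuration.",
    "Installation": "Install the theme with pip.",
}

print(matching_sections("Tabs", sections))
```

Under these assumptions, the query "Tabs" is reduced to "tab", so the section mentioning "table" is returned alongside the one that really is about tabs, while the unrelated "Installation" section is not.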

ovasquez commented 7 years ago

I was unaware of the stemming, so that would make the results valid. I understand that showing the chunks of text where the match was found is not currently supported by lunr.js, so the way results are displayed now makes sense.

Thanks for the detailed explanation about the search.

squidfunk commented 7 years ago

If something better for summarization comes up, I'll definitely include it.