squidfunk / mkdocs-material

Documentation that simply works
https://squidfunk.github.io/mkdocs-material/
MIT License
18.65k stars, 3.35k forks

Default value for search-plugin separator has a typo #7185

Closed: casio888 closed this issue 2 weeks ago

casio888 commented 2 weeks ago

Context

No response

Bug description

We hit a TypeError in the lunr search for specific letters and search terms, which left the search bar broken. After looking into it, we traced the error to a code block whose content was not split correctly into the expandedTerms, so the term was not found in the invertedIndex.

When looking at the default separator, documented here (https://squidfunk.github.io/mkdocs-material/plugins/search/#config.separator), we saw that ( and ) are described as special characters that should be recognized as separators, but the regex does not escape them, so they were skipped.

We escaped them in an override and it worked fine. I think the default separator should be changed to look like this: separator: '[\s\-,:!=\[\]\(\)"/]+|(?!\b)(?=[A-Z][a-z])|\.(?!\d)|&[lg]t;'

Related links

Reproduction

I can't get a minimum reproduction to show the bug, but it happens in our project and the fix works.

Steps to reproduce

Create a code block (starting and ending with 3 backticks) that contains a function call, then search for the function name:

from dask_jobqueue import SLURMCluster

cluster = SLURMCluster(queue='test',
  cores=4,
  processes=2,
  project='project',
  memory="8GB",
  walltime="10:23:11")

Search for slurm or slurmcluster

Browser

No response

squidfunk commented 2 weeks ago

Thanks for reporting. I'm not sure why you're seeing the behavior you described, but this is the exact same separator we use in our own documentation. Escaping ( and ) inside a character class [...] is not necessary. The only characters that need to be escaped are ], ^ and -, as they are control characters for character classes.
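To make the point about character classes concrete, here is a minimal sketch that splits one line from the reproduction steps with both variants of the pattern: the one without escaped parentheses and the one proposed in the report. The variable names are made up for illustration, and this only exercises `String.prototype.split`, not the search plugin itself.

```javascript
// Sketch: compare the separator without escaped parentheses against the
// variant proposed in the report, which escapes ( and ) in the class.
const unescapedParens = /[\s\-,:!=\[\]()"/]+|(?!\b)(?=[A-Z][a-z])|\.(?!\d)|&[lg]t;/;
const escapedParens = /[\s\-,:!=\[\]\(\)"/]+|(?!\b)(?=[A-Z][a-z])|\.(?!\d)|&[lg]t;/;

// Sample line taken from the reproduction steps.
const sample = "cluster = SLURMCluster(queue='test',";

const tokensUnescaped = sample.split(unescapedParens);
const tokensEscaped = sample.split(escapedParens);

// Inside [...] a parenthesis is a literal character either way,
// so both patterns produce the same token list.
const sameTokens =
  JSON.stringify(tokensUnescaped) === JSON.stringify(tokensEscaped);

console.log(sameTokens); // true
console.log(/[()]/.test("("), /[\(\)]/.test("(")); // true true
```

If the two token lists ever differed, the escaping would matter; since they are identical, the cause of the TypeError is likely elsewhere.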

Also, if you run it in the browser's console, you'll see that the separator works:

"cluster = SLURMCluster(queue='test',".split(/[\s\-,:!=\[\]()"/]+|(?!\b)(?=[A-Z][a-z])|\.(?!\d)|&[lg]t;/)
// => ['cluster', 'SLURM', 'Cluster', 'queue', "'test'", '']
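The same check can be extended to the whole snippet from the report. The sketch below is only an approximation: dropping empty strings and lower-casing the tokens are assumptions standing in for whatever the search pipeline does afterwards, and none of this calls lunr.

```javascript
// Separator as used in the console example above.
const separator = /[\s\-,:!=\[\]()"/]+|(?!\b)(?=[A-Z][a-z])|\.(?!\d)|&[lg]t;/;

// The code block from the reproduction steps, joined into one string.
const snippet = [
  "from dask_jobqueue import SLURMCluster",
  "",
  "cluster = SLURMCluster(queue='test',",
  "  cores=4,",
  "  processes=2,",
  "  project='project',",
  '  memory="8GB",',
  '  walltime="10:23:11")',
].join("\n");

// Assumption: drop empty strings and lower-case the rest, as a rough
// stand-in for later normalization; this is not the plugin's pipeline.
const tokens = snippet
  .split(separator)
  .filter((token) => token.length > 0)
  .map((token) => token.toLowerCase());

// "SLURMCluster" is split at the camel-case boundary into two tokens.
console.log(tokens.includes("slurm"), tokens.includes("cluster")); // true true
console.log(tokens.includes("slurmcluster")); // false
```

With plain splitting, "slurmcluster" never appears as a single token, which may be relevant when comparing the two search terms from the report.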

If it's still a problem, please create a proper minimal reproduction which we can run. Otherwise, it's working as intended.