devinit / DIwebsite-redesign

New DI website 2019

Replace edge ngram tokenizer with ngram to prevent partial matches #1342

Closed akmiller01 closed 1 year ago

akmiller01 commented 1 year ago

Right now, a search for "disability" matches "disbursement" because both begin with "dis", which is how the edge n-gram tokenizer behaves: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-edgengram-tokenizer.html

From the Elasticsearch documentation: "For example, if the max_gram is 3 and search terms are truncated to three characters, the search term apple is shortened to app. This means searches for apple return any indexed terms matching app, such as apply, approximate and apple."

So I think we should swap this for the standard tokenizer to make search behavior more predictable: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-standard-tokenizer.html
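To illustrate the difference, here is a minimal pure-Python sketch of the two matching behaviors. It does not call Elasticsearch and is not this project's search code. The helper names (`edge_ngrams`, `truncated_query`, `whole_word_tokens`) and the `min_gram`/`max_gram` values are assumptions chosen to mirror the example above, and `whole_word_tokens` is only a rough stand-in for whole-word tokenization with lowercasing.

```python
import re


def edge_ngrams(term, min_gram=1, max_gram=3):
    """Return the prefixes of `term` between min_gram and max_gram characters.

    Hypothetical helper that mimics what an edge n-gram tokenizer emits
    for a single word at index time.
    """
    term = term.lower()
    longest = min(max_gram, len(term))
    return {term[:n] for n in range(min_gram, longest + 1)}


def truncated_query(term, max_gram=3):
    """Shorten a search term to max_gram characters, as in the quoted example."""
    return term.lower()[:max_gram]


def whole_word_tokens(text):
    """Rough stand-in for whole-word tokenization: lowercase word runs."""
    return set(re.findall(r"\w+", text.lower()))


indexed_term = "disbursement"
query = "disability"

# Edge n-gram matching: the truncated query "dis" is one of the indexed prefixes.
edge_match = truncated_query(query) in edge_ngrams(indexed_term)

# Whole-word matching: the full query must equal an indexed token.
whole_match = query in whole_word_tokens(indexed_term)

print(edge_match, whole_match)
```

With prefix tokens the two unrelated words collide on "dis", while whole-word tokens only match when the full term is present. The trade-off is that whole-word matching also gives up search-as-you-type style prefix hits.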
