elastic / elasticsearch

Free and Open, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Re-think integration of the common-grams filter #31427

Open jpountz opened 6 years ago

jpountz commented 6 years ago

The goal of the common-grams filter is to speed up phrase queries that contain common words. To do so, it adds shingles to the token stream whenever it encounters a common word (from a user-provided list). Users then need to configure a search quote analyzer that mirrors the index analyzer, except that its common-grams filter has query_mode set to true. (See https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-common-grams-tokenfilter.html)
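For context, the setup described above looks roughly like the sketch below, which only builds and prints an index-creation request body. The index layout, the analyzer and filter names (index_grams, search_quote_grams, common_grams_index, common_grams_query), the body field, and the word list are all made up for illustration; common_grams, common_words, query_mode, and search_quote_analyzer are the actual option names.

```python
import json

# Hypothetical word list, for illustration only.
COMMON_WORDS = ["a", "is", "the"]

index_body = {
    "settings": {
        "analysis": {
            "filter": {
                # Index-time filter: emits unigrams plus common-word bigrams.
                "common_grams_index": {
                    "type": "common_grams",
                    "common_words": COMMON_WORDS,
                },
                # Same filter duplicated by hand, with query_mode enabled.
                "common_grams_query": {
                    "type": "common_grams",
                    "common_words": COMMON_WORDS,
                    "query_mode": True,
                },
            },
            "analyzer": {
                "index_grams": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "common_grams_index"],
                },
                "search_quote_grams": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "common_grams_query"],
                },
            },
        }
    },
    "mappings": {
        "properties": {
            # Hypothetical field name.
            "body": {
                "type": "text",
                "analyzer": "index_grams",
                "search_analyzer": "index_grams",
                # Phrase queries use the query_mode variant.
                "search_quote_analyzer": "search_quote_grams",
            }
        }
    },
}

print(json.dumps(index_body, indent=2))
```

Note that the filter and the analyzer each have to be declared twice and kept in sync by hand, which is the duplication this issue is about.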

This is not a great user experience. I think we should at least make it possible for the search quote analyzer to automatically pick up the query_mode: true setting, and maybe even deprecate this token filter in our docs and integrate it differently, similar to what we did with the index_phrases option (https://github.com/elastic/elasticsearch/pull/30450).

For instance, one idea could be to allow users to make the index_phrases option less space-intensive by adding a common_words option and only generating shingles for this set of common words. The default would be to generate shingles for every token, like we do today.
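To make the space trade-off concrete, here is a toy sketch of the idea in plain Python. It is not how Lucene or Elasticsearch implement shingling. The function name phrase_shingles is hypothetical, and it assumes a pair counts as a shingle when either of its two tokens is a common word, mirroring what the common-grams filter does today.

```python
from typing import List, Optional, Set, Tuple


def phrase_shingles(
    tokens: List[str], common_words: Optional[Set[str]] = None
) -> List[Tuple[str, str]]:
    """Return the two-token shingles that would be indexed for phrase queries.

    With common_words=None every adjacent pair is kept (today's behavior).
    Otherwise only pairs that touch a common word are kept (assumption).
    """
    shingles = []
    for left, right in zip(tokens, tokens[1:]):
        if common_words is None or left in common_words or right in common_words:
            shingles.append((left, right))
    return shingles


tokens = ["the", "quick", "brown", "fox", "is", "fast"]

# Default: a shingle for every adjacent pair of tokens.
all_pairs = phrase_shingles(tokens)

# Proposed option: shingles only around the configured common words.
common_only = phrase_shingles(tokens, common_words={"the", "is"})

print(len(all_pairs), all_pairs)
print(len(common_only), common_only)
```

On this six-token input the default keeps all five adjacent pairs, while the restricted variant keeps three, dropping ("quick", "brown") and ("brown", "fox"). Phrases made only of rare terms would then have to fall back to regular positional matching.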

elasticmachine commented 6 years ago

Pinging @elastic/es-search-aggs

elasticsearchmachine commented 2 months ago

Pinging @elastic/es-search (Team:Search)

elasticsearchmachine commented 1 month ago

Pinging @elastic/es-search-relevance (Team:Search Relevance)