elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

fuzziness not working in conjunction with bool_query during multi_match #56229

Open sachin-frayne opened 4 years ago

sachin-frayne commented 4 years ago

Elasticsearch version: 7.6.2

Describe the feature:

When I search the fields created by the search_as_you_type data type, I would like some fuzziness leniency; however, I am not seeing any. See the steps to reproduce below for full examples.

TL;DR:

  1. I would like beraking to match documents with breaking, via the standard field, with a fuzziness of 1 and a bool_prefix query type.
  2. I would also like berak to match documents with breaking, via the _index_prefix field, with a fuzziness of 1 and a bool_prefix query type.

Steps to reproduce:

PUT index
{
  "mappings": {
    "properties": {
      "field": {
        "type": "search_as_you_type"
      }
    }
  }
}

PUT index/_doc/1
{
  "field": "breaking"
}

GET index/_search
{
  "query": {
    "multi_match": {
      "query": "beraking",
      "type": "bool_prefix",
      "fuzziness": 1,
      "fields": [
        "field",
        "field._2gram",
        "field._3gram",
        "field._index_prefix"
      ]
    }
  }
}

GET index/_search
{
  "query": {
    "multi_match": {
      "query": "berak",
      "type": "bool_prefix",
      "fuzziness": 1,
      "fields": [
        "field",
        "field._2gram",
        "field._3gram",
        "field._index_prefix"
      ]
    }
  }
}

Additional Notes:

With query 1, it starts working when "type": "bool_prefix" is removed, but then the bool_prefix behavior is no longer preserved, i.e.:

GET index/_search
{
  "query": {
    "multi_match": {
      "query": "beraking",
      "fuzziness": 1,
      "fields": [
        "field",
        "field._2gram",
        "field._3gram",
        "field._index_prefix"
      ]
    }
  }
}

elasticmachine commented 4 years ago

Pinging @elastic/es-search (:Search/Search)

telendt commented 4 years ago

To be fair, this is documented (for both the bool_prefix multi_match query and the match_bool_prefix query):

The fuzziness, prefix_length, max_expansions, rewrite, and fuzzy_transpositions parameters are supported for the terms that are used to construct term queries, but do not have an effect on the prefix query constructed from the final term.

-- https://www.elastic.co/guide/en/elasticsearch/reference/7.x/query-dsl-multi-match-query.html

The fuzziness, prefix_length, max_expansions, fuzzy_transpositions, and fuzzy_rewrite parameters can be applied to the term subqueries constructed for all terms but the final term. They do not have any effect on the prefix query constructed for the final term.

-- https://www.elastic.co/guide/en/elasticsearch/reference/7.x/query-dsl-match-bool-prefix-query.html

But honestly I also don't understand this decision.
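
To illustrate the documented behavior quoted above, here is a rough sketch of how a single-word bool_prefix query is conceptually rewritten. It assumes the query text analyzes to one term; the exact internal rewrite for search_as_you_type fields may differ (for example, by using the _index_prefix subfield), so treat it as an approximation rather than the actual implementation:

```
GET index/_search
{
  "query": {
    "bool": {
      "should": [
        { "prefix": { "field": "beraking" } }
      ]
    }
  }
}
```

Because "beraking" is both the first and the final term, it only ever becomes a prefix query, and fuzziness has no term query left to apply to.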

ankane commented 4 years ago

Want to voice my support for this as well. It'd be nice if there was an option to do a term match with fuzziness on the final term (in addition to the prefix query with an OR condition). Otherwise, search as you type doesn't find misspellings in the final term. It's most noticeable with single word queries, where misspellings are never found.
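
Until such an option exists, one possible approximation is to combine the two clauses by hand. This is only a sketch under assumptions, not a built-in feature: it wraps the bool_prefix query from the original report and a separate fuzzy match on the root field in a bool query, so a document matches if either clause does:

```
GET index/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "beraking",
            "type": "bool_prefix",
            "fields": [
              "field",
              "field._2gram",
              "field._3gram",
              "field._index_prefix"
            ]
          }
        },
        {
          "match": {
            "field": {
              "query": "beraking",
              "fuzziness": 1
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
```

Note that this likely only covers a misspelled complete word (query 1 in the report). A misspelled partial word such as berak would still need fuzziness on the prefix itself, and scoring will differ from a plain bool_prefix query.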

nielskrijger commented 2 years ago

As a workaround, I'm adding a space at the end of the search query; that way, fuzziness does appear to be applied to the last word.

I'm not sure what the downsides of doing this are. I've just been experimenting on a test dataset and am reasonably happy with the results.
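
For reference, a sketch of this workaround applied to query 1 from the original report. The only change is the trailing space in the query string; the effect is as reported in this comment and has not been verified independently:

```
GET index/_search
{
  "query": {
    "multi_match": {
      "query": "beraking ",
      "type": "bool_prefix",
      "fuzziness": 1,
      "fields": [
        "field",
        "field._2gram",
        "field._3gram",
        "field._index_prefix"
      ]
    }
  }
}
```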

elasticsearchmachine commented 1 year ago

Pinging @elastic/es-search (Team:Search)

elasticsearchmachine commented 4 months ago

Pinging @elastic/es-search-relevance (Team:Search Relevance)