jurismarches / luqum

A Lucene query parser generating Elasticsearch queries and more!

Question: Is it possible to allow `minimum_should_match` on all OR operations? #45

Closed by ztawil 4 years ago

ztawil commented 4 years ago

Is it possible to force a minimum_should_match value on every bool query that is generated for an OR operation?

Currently I have something like:

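from luqum.elasticsearch import ElasticsearchQueryBuilder
from luqum.parser import parser
ES_BUILDER = ElasticsearchQueryBuilder(default_field="content")  # assumed setup, not shown in the original post
search_content = "(a OR b) AND (b OR c)"
tree = parser.parse(search_content)  # Lucene query string -> luqum tree
query = ES_BUILDER(tree)  # luqum tree -> Elasticsearch query dict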

And the query yields:

{
  "bool": {
    "must": [
      {
        "bool": {
          "should": [
            {
              "match": {
                "content": {
                  "query": "a",
                  "zero_terms_query": "none"
                }
              }
            },
            {
              "match": {
                "content": {
                  "query": "b",
                  "zero_terms_query": "none"
                }
              }
            }
          ]
        }
      },
      {
        "bool": {
          "should": [
            {
              "match": {
                "content": {
                  "query": "b",
                  "zero_terms_query": "none"
                }
              }
            },
            {
              "match": {
                "content": {
                  "query": "c",
                  "zero_terms_query": "none"
                }
              }
            }
          ]
        }
      }
    ]
  }
}

Whereas I'd like it to be:

{
  "bool": {
    "must": [
      {
        "bool": {
          "minimum_should_match": 1,
          "should": [
            {
              "match": {
                "content": {
                  "query": "a",
                  "zero_terms_query": "none"
                }
              }
            },
            {
              "match": {
                "content": {
                  "query": "b",
                  "zero_terms_query": "none"
                }
              }
            }
          ]
        }
      },
      {
        "bool": {
          "minimum_should_match": 1,
          "should": [
            {
              "match": {
                "content": {
                  "query": "b",
                  "zero_terms_query": "none"
                }
              }
            },
            {
              "match": {
                "content": {
                  "query": "c",
                  "zero_terms_query": "none"
                }
              }
            }
          ]
        }
      }
    ]
  }
}

But perhaps I'm doing something wrong?

wouterweerkamp commented 4 years ago

Given that both bool (OR) queries only have should clauses, Elasticsearch uses a minimum_should_match of 1 by default. From the documentation (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html#bool-min-should-match):

"If the bool query includes at least one should clause and no must or filter clauses, the default value is 1. Otherwise, the default value is 0."

So the expected output you gave is in practice the same as the one that is generated.
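
To make the default rule quoted above concrete, here is a minimal sketch in plain Python. The helper name effective_minimum_should_match is hypothetical (it is not part of luqum or Elasticsearch); it only restates the documented default so the generated and the desired bool bodies can be compared.

```python
def effective_minimum_should_match(bool_body):
    """Return the minimum_should_match that applies to a bool query body.

    Hypothetical helper: it mirrors the documented default, nothing more.
    """
    # An explicit value always wins over the default.
    if "minimum_should_match" in bool_body:
        return bool_body["minimum_should_match"]
    has_should = bool(bool_body.get("should"))
    has_must_or_filter = bool(bool_body.get("must")) or bool(bool_body.get("filter"))
    # Documented default: 1 when there are should clauses and no must/filter.
    return 1 if has_should and not has_must_or_filter else 0


# Body of one of the inner bool queries from the generated output (shortened).
generated = {
    "should": [
        {"match": {"content": {"query": "a"}}},
        {"match": {"content": {"query": "b"}}},
    ]
}
# Same body with the explicit option from the desired output.
wanted = dict(generated, minimum_should_match=1)

same = effective_minimum_should_match(generated) == effective_minimum_should_match(wanted)
```

Here `same` is True, which is why the two queries behave identically. The default would only drop to 0 if a must or filter clause sat next to should inside the same bool query.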

alexgarel commented 4 years ago

Hi @ztawil, as @wouterweerkamp explained, you probably don't need minimum_should_match. However, I added some support for passing options to operations. If you need it, you can subclass ElasticsearchQueryBuilder and override _should_operation to pass options.
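
If the explicit value is still wanted, another route that does not rely on luqum internals is to post-process the generated dict. The sketch below is an assumption, not luqum's API: force_minimum_should_match is a hypothetical helper that walks the query and adds the option to every bool query that has a should clause. It assumes that no document field is itself named "bool".

```python
def force_minimum_should_match(query, value=1):
    """Return a copy of an Elasticsearch query dict in which every bool query
    with a should clause carries an explicit minimum_should_match.

    Hypothetical helper, independent of luqum; the input is left untouched.
    """
    if isinstance(query, list):
        return [force_minimum_should_match(item, value) for item in query]
    if not isinstance(query, dict):
        return query  # leaf value (string, number, ...)
    # Rebuild the dict so nested bool queries are handled first.
    result = {key: force_minimum_should_match(child, value) for key, child in query.items()}
    bool_body = result.get("bool")
    if isinstance(bool_body, dict) and "should" in bool_body:
        # Keep an existing value if one was already set.
        bool_body.setdefault("minimum_should_match", value)
    return result


# Shortened version of the generated query for "(a OR b) AND (b OR c)".
generated = {
    "bool": {
        "must": [
            {"bool": {"should": [{"match": {"content": {"query": "a"}}},
                                 {"match": {"content": {"query": "b"}}}]}},
            {"bool": {"should": [{"match": {"content": {"query": "b"}}},
                                 {"match": {"content": {"query": "c"}}}]}},
        ]
    }
}
explicit = force_minimum_should_match(generated)
```

Both inner bool queries in `explicit` now contain "minimum_should_match": 1, while the outer bool query, which has only must clauses, is unchanged.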