mozilla / elasticutils

[deprecated] A friendly chainable ElasticSearch interface for python
http://elasticutils.rtfd.org
BSD 3-Clause "New" or "Revised" License

Add support for nested bool queries #165

Open OOPMan opened 10 years ago

OOPMan commented 10 years ago

In order to execute the following query, query_raw needs to be used:

{
    "query": {
        "bool": {
            "must": [
                {
                    "multi_match": {
                        "query": "general",
                        "fields": ["first_name.lowercase", "last_name.lowercase", "profession.name.lowercase"],
                        "use_dis_max": false
                    }
                },
                {
                    "multi_match": {
                        "query": "meadowlands",
                        "fields": ["workplace.province.lowercase", "workplace.locality.lowercase"],
                        "use_dis_max": false
                    }
                }
            ]
        }
    },
    "facets": {
        "profession": {
            "terms": {
                "field": "profession.name"
            }
        },
        "province": {
            "terms": {
                "field": "workplace.province"
            }
        },
        "locality": {
            "terms": {
                "field": "workplace.locality"
            }
        }
    }
}

I think this could be resolved by allowing Q instances to contain nested Q instances.

willkg commented 10 years ago

What is it you want to add support for? multi_match?

OOPMan commented 10 years ago

No, the usage of the boolean must in the fashion above. In terms of Python code I had to do this:

query_components = []
if what_or_who != 'all':
    query_components.append({
        'multi_match': {
            'query': what_or_who,
            'fields': ['%s.lowercase' % field for field in ES_WHAT_FIELDS]
        }})
if where != 'all':
    query_components.append({
        'multi_match': {
            'query': where,
            'fields': ['%s.lowercase' % field for field in ES_WHERE_FIELDS]
        }})
if query_components:
    search = search.query_raw({
        'bool': {
            'must': query_components
        }
    })

where I expected to be able to do something like this:

queries = []
# Base query
if what_or_who != 'all':
    queries.append(Q(should=True, **{'%s.lowercase__match' % field: what_or_who.lower() for field in ES_WHAT_FIELDS}))
if where != 'all':
    queries.append(Q(should=True, **{'%s.lowercase__match' % field: where.lower() for field in ES_WHERE_FIELDS}))

s = s.query(must=True, *queries)

Am I missing something? Is there a way to do this already?

willkg commented 10 years ago

I'm puzzled. ElasticUtils doesn't support multi-match, so neither of your code fragments will produce a multi-match.

What does the second code fragment produce? Can you do print s._build_query() and copy-paste that in this issue?

OOPMan commented 10 years ago

The second code fragment doesn't work:

>>> q1
<Q should=[('profession.name.lowercase__match', 'dentist'), ('last_name.lowercase__match', 'dentist'), ('first_name.lowercase__match', 'dentist')] must=[] must_not=[]>
>>> q2
<Q should=[('workplace.province.lowercase__match', 'dentist'), ('workplace.locality.lowercase__match', 'dentist')] must=[] must_not=[]>
>>> s = S()
>>> s = s.query(q1, q2, must=True)
>>> s._build_query()
{'query': {'bool': {'should': [{'match': {'profession.name.lowercase': 'dentist'}}, {'match': {'last_name.lowercase': 'dentist'}}, {'match': {'first_name.lowercase': 'dentist'}}, {'match': {'workplace.province.lowercase': 'dentist'}}, {'match': {'workplace.locality.lowercase': 'dentist'}}]}}}

Effectively the query builder flattens the query structure down.

As it is, the equivalent of the following SQL snippet requires the use of a raw query:

WHERE (a = 1 OR b = 2) AND (c = 3 OR d = 4 OR e = 5)
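For illustration only, the nested structure that the SQL snippet above corresponds to can be sketched with plain dictionaries. The helper names `match`, `any_of` and `all_of` are hypothetical and are not part of ElasticUtils; they only build the raw bool query body.

```python
def match(field, value):
    """Build a single match clause as a plain dict."""
    return {"match": {field: value}}


def any_of(*clauses):
    """OR group: a bool query whose should clauses are alternatives."""
    return {"bool": {"should": list(clauses)}}


def all_of(*clauses):
    """AND group: a bool query in which every clause must match."""
    return {"bool": {"must": list(clauses)}}


# (a = 1 OR b = 2) AND (c = 3 OR d = 4 OR e = 5)
nested = all_of(
    any_of(match("a", 1), match("b", 2)),
    any_of(match("c", 3), match("d", 4), match("e", 5)),
)
```

A dict built this way could presumably be handed to query_raw in the same way as the earlier fragment, since it is just the raw query body.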

OOPMan commented 10 years ago

With regard to the multi_match, that's not important, since I'm just using it as a shorter way of doing a bool with a should, although I realise I should specify that it use bool and not dis_max.

willkg commented 10 years ago

shoulds, musts and must_nots get flattened, so you can't build a nested set of them. That's the way I implemented it. To date, no one has mentioned needing this and I haven't seen a compelling use case to warrant the added complexity.

I'll fix the title of your issue and let this sit and see if it gathers momentum.
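To make the effect of flattening concrete, here is a toy sketch in plain Python. It assumes a heavily simplified model in which a match clause is plain equality and a bool query with no must clauses needs at least one should clause to match. The `matches` and `m` helpers are hypothetical; this is neither ElasticUtils code nor a model of Elasticsearch scoring.

```python
def matches(query, doc):
    """Toy evaluator: does doc satisfy query? Ignores scoring and analysis."""
    if "match" in query:
        (field, value), = query["match"].items()
        return doc.get(field) == value
    clauses = query["bool"]
    must = clauses.get("must", [])
    should = clauses.get("should", [])
    must_not = clauses.get("must_not", [])
    if any(matches(q, doc) for q in must_not):
        return False
    if not all(matches(q, doc) for q in must):
        return False
    # With no must clauses, at least one should clause has to match.
    if should and not must:
        return any(matches(q, doc) for q in should)
    return True


def m(field, value):
    """Shorthand for a match clause."""
    return {"match": {field: value}}


# Flattened: every condition ends up in one should list.
flat = {"bool": {"should": [m("a", 1), m("b", 2), m("c", 3), m("d", 4), m("e", 5)]}}

# Nested: (a = 1 OR b = 2) AND (c = 3 OR d = 4 OR e = 5)
nested = {"bool": {"must": [
    {"bool": {"should": [m("a", 1), m("b", 2)]}},
    {"bool": {"should": [m("c", 3), m("d", 4), m("e", 5)]}},
]}}

doc = {"a": 1, "c": 0}
print(matches(flat, doc))    # True: a = 1 alone satisfies the flat OR
print(matches(nested, doc))  # False: the second group has no match
```

Under these assumptions, the flattened form accepts a document that satisfies only one of the two groups, which is why the nested form cannot be expressed once the clauses are merged.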

OOPMan commented 10 years ago

Okay, cool. I may fork and play around with getting this to work myself. If I do manage it, I'll let you know :-)

willkg commented 10 years ago

I haven't heard anything about this since August.

@robhudson did have a use case for it, but otherwise it seems no one is interested.

This requires some non-trivial rewriting of things, so I'm going to push it off until post 0.10.

koterpillar commented 10 years ago

I have a use case for it, but OTOH contemplating just switching away from EU in favor of writing raw JSON - we're doing quite complex queries anyway.

OOPMan commented 10 years ago

It is pretty easy to do using query_raw, but it also feels like something that should be supported.