aparo / pyes

Python connector for ElasticSearch - the pythonic way to use ElasticSearch
BSD 3-Clause "New" or "Revised" License

search_multi does not work #434

Open imhoffd opened 10 years ago

imhoffd commented 10 years ago

Normal search() works fine for each of these queries on its own, but search_multi() fails. I printed the curl request before it was sent (around here https://github.com/aparo/pyes/blob/master/pyes/es.py#L444), which printed this:

curl -XGET http://localhost:9200/client_3/_msearch -d '""
{"query": {"bool": {"must": [{"match_all": {}}]}}}
""
{"query": {"bool": {"must": [{"match_all": {}}]}}}
""
{"query": {"bool": {"must": [{"match_all": {}}]}}}
'

That request does not correspond to the Multi Search API of Elasticsearch: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-multi-search.html
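For comparison, here is a minimal sketch of what a well-formed Multi Search body looks like: one JSON header object per search (possibly empty, but not a bare `""` string), followed by one JSON body line, with a trailing newline at the end. `build_msearch_body` is a hypothetical helper written for illustration only; it is not part of pyes.

```python
import json

def build_msearch_body(queries, index=None):
    """Build a newline-delimited _msearch body: header line, then body line, per search."""
    lines = []
    for query in queries:
        # Header: a JSON object (possibly empty), never a bare string like "".
        header = {"index": index} if index else {}
        lines.append(json.dumps(header))
        # Body: the search request itself.
        lines.append(json.dumps({"query": query}))
    # The request body must end with a trailing newline.
    return "\n".join(lines) + "\n"

match_all = {"bool": {"must": [{"match_all": {}}]}}
body = build_msearch_body([match_all] * 3, index="client_3")
print(body)
```

With `index=None` each header is `{}`, which leaves the index to be taken from the request URL (`/client_3/_msearch` in the request above).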

Elasticsearch is returning:

{
    "error": null
}

for simple search_multi usage:

queries = queries[:3]
print(queries)
print(queries[0].serialize())
print(queries[1].serialize())
print(queries[2].serialize())

rs = es.search_multi(queries=queries)
for r in rs:
    pass  # do something

I get this printed:

[<pyes.query.BoolQuery object at 0x2b42906fed90>, <pyes.query.BoolQuery object at 0x2b42906fe190>, <pyes.query.BoolQuery object at 0x2b42906fe1d0>]
{'bool': {'must': [{'match_all': {}}]}}
{'bool': {'must': [{'match_all': {}}]}}
{'bool': {'must': [{'match_all': {}}]}}

Traceback:

    for r in rs:
  File "/home/vagrant/.python/lib/python3.3/site-packages/pyes/es.py", line 1556, in __next__
    self._do_search()
  File "/home/vagrant/.python/lib/python3.3/site-packages/pyes/es.py", line 1499, in _do_search
    response = self._search_raw_multi()
  File "/home/vagrant/.python/lib/python3.3/site-packages/pyes/es.py", line 1526, in _search_raw_multi
    doc_types_list=self.doc_types_list, routing_list=self.routing_list)
  File "/home/vagrant/.python/lib/python3.3/site-packages/pyes/es.py", line 967, in search_raw_multi
    return body, self._send_request('GET', path, body)
  File "/home/vagrant/.python/lib/python3.3/site-packages/pyes/es.py", line 416, in _send_request
    raise_if_error(response.status, decoded)
  File "/home/vagrant/.python/lib/python3.3/site-packages/pyes/convert_errors.py", line 70, in raise_if_error
    if '; nested: ' in error:
TypeError: argument of type 'NoneType' is not iterable
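The final frame fails because the decoded response carries `"error": null`, so the `in` test runs against `None`. A defensive check could look roughly like the sketch below. `has_nested_cause` is a hypothetical name and this is not the actual code in `convert_errors.py`; it only illustrates guarding the membership test.

```python
def has_nested_cause(error):
    """Return True if the error message contains a '; nested: ' marker.

    Hypothetical helper: a missing (None) or non-string error is treated
    as having no nested cause instead of raising TypeError.
    """
    return isinstance(error, str) and "; nested: " in error

decoded = {"error": None}  # the decoded form of the response shown earlier
print(has_nested_cause(decoded["error"]))
```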

Elasticsearch info:

{
    "version": {
        "number": "1.1.1",
        "build_hash": "f1585f096d3f3985e73456debdc1a0745f512bbc",
        "build_timestamp": "2014-04-16T14:27:12Z",
        "build_snapshot": false,
        "lucene_version": 4.7
    }
}

pyes version: 0.90.0 (problem exists in 0.99.2 as well)

imhoffd commented 10 years ago

And, in the meantime, I can't even use search_raw_multi because it appears to be broken as well.

When I pass it a list of Query objects, I get this error:

  File "/home/vagrant/.python/lib/python3.3/site-packages/pyes/es.py", line 943, in search_raw_multi
    queries = list(map(self._encode_query, queries))
  File "/home/vagrant/.python/lib/python3.3/site-packages/pyes/es.py", line 1155, in _encode_query
    % query.__class__)
pyes.exceptions.InvalidQuery: `query` must be Query or dict instance, not <class 'pyes.query.Search'>

When I pass it a list of raw dict queries (made with serialize()), I get this error:

  File "/home/vagrant/.python/lib/python3.3/site-packages/pyes/es.py", line 942, in search_raw_multi
    else query.serialize() for query in queries]
  File "/home/vagrant/.python/lib/python3.3/site-packages/pyes/es.py", line 942, in <listcomp>
    else query.serialize() for query in queries]
AttributeError: 'dict' object has no attribute 'serialize'

Probably due to the logic on these lines: https://github.com/aparo/pyes/blob/master/pyes/es.py#L1023-L1024
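One way to accept both input forms is to normalize each entry before encoding it. The following is only a sketch of that idea, not the pyes implementation: `to_query_dict` and `FakeQuery` are hypothetical names, and `FakeQuery` merely stands in for any object with a `serialize()` method.

```python
def to_query_dict(query):
    """Return a plain dict for either a raw dict or an object exposing serialize()."""
    if isinstance(query, dict):
        return query
    if hasattr(query, "serialize"):
        return query.serialize()
    raise TypeError("query must be a dict or provide serialize(), not %r" % type(query))

class FakeQuery:
    """Stand-in for a pyes query object; only serialize() matters here."""
    def serialize(self):
        return {"match_all": {}}

# Mixed input: one query object and one raw dict.
normalized = [to_query_dict(q) for q in [FakeQuery(), {"match_all": {}}]]
print(normalized)
```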

imhoffd commented 10 years ago

Ah, the header line of each msearch entry is being left empty. If I pass the indices explicitly, like this, it works:

indexes = [client_index for _ in range(0, 100)]
rs = es.search_multi(queries=queries, indices_list=indexes)

It should be defaulting to the default index. If I'm off base here then the API could definitely use some documentation on search_multi.
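Until that default exists, the workaround above can be written so the index list always matches the number of queries. This is a small sketch; `indices_for` is a hypothetical helper and the placeholder strings stand in for real pyes query objects.

```python
def indices_for(queries, default_index):
    """Return one index name per query, so every msearch header names an index."""
    return [default_index] * len(queries)

queries = ["q1", "q2", "q3"]  # placeholders for pyes query objects
indices_list = indices_for(queries, "client_3")
print(indices_list)

# With a live connection this would be passed along as in the snippet above:
# rs = es.search_multi(queries=queries, indices_list=indices_list)
```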

thejeff77 commented 9 years ago

I'm using pyes version 0.99.5, and I found out that the problem is that the search is never executed. The call to search_multi constructs a ResultSetMulti from all the parameters that you pass in, but the class it returns never has any results in it. I figured out that there is a private method in ResultSetMulti called _do_search() that executes search_raw_multi, and populates the class's result data. This method is of course... never called. So... Here is the code:

result = es_conn.search_multi(
    queries=pyes_searches, indices_list=index_list, doc_types_list=doc_type_list,
    search_type_list=search_type_list, routing_list=routing_list)
result._do_search()  # private method; forces the multi search to run
return result

.search works just fine without digging into the code to find some private method that is never called, but not with search_multi!