mozilla / elasticutils

[deprecated] A friendly chainable ElasticSearch interface for python
http://elasticutils.rtfd.org
BSD 3-Clause "New" or "Revised" License
243 stars 76 forks source link

bulk monkeypatch incorrectly assumes we're indexing #266

Closed adriaant closed 9 years ago

adriaant commented 9 years ago

Version 0.10.1 does not contain this commit: https://github.com/mozilla/elasticutils/commit/07284b980fa5a2875e10d6bbff91c0bc14a8315b

and will therefore break with the latest ES and ES-py.

willkg commented 9 years ago

I don't understand what you're talking about. Can you be more specific with version numbers and possibly an example?

adriaant commented 9 years ago

Current version on Pypi is 0.10.1 and if I look at the code that pip has installed, I don't see the conditional if VERSION == (0, 4, 5): that is implemented in commit https://github.com/mozilla/elasticutils/commit/07284b980fa5a2875e10d6bbff91c0bc14a8315b whereas that commit should have been included in the 0.10.1 release, because it was pushed in the v0.10.1 branch. That lead me to wonder if something was overlooked when you released 0.10.1 to pypi.

With the version on Pypi and the latest official elasticsearch-py, you'll bump into KeyErrors when using the bulk insert helper, because the monkeypatch is wrapping it with a bugged function (fixed in the referred commit but not in the release).

willkg commented 9 years ago

Which versions of Elasticsearch and elasticsearch-py?

adriaant commented 9 years ago

ElasticSearch: 1.3.2 elasticsearch-py: 1.2.0

willkg commented 9 years ago

Here's what I know:

  1. I inverted the monkeypatch for 0.10.1 and changed the elasticsearch-py requirement to >= 1.0. That fixed a lot of issues with requiring 0.4.5 and no later version. That's in f9f7ede3e3d5652d9ae225c3339ae0dd0be967dc.
  2. We're not testing with Elasticsearch 1.3, so I maybe that's the problem? http://elasticutils.readthedocs.org/en/latest/changelog.html#version-0-10-1-september-22nd-2014
  3. It should work fine with elasticsearch-py 1.2.0. That's tested: https://github.com/mozilla/elasticutils/blob/master/tox.ini#L23

If you downgrade to Elasticsearch 1.2, does it work fine? Can you copy/paste the traceback you're getting?

adriaant commented 9 years ago

Downgrading is not really an option for me. I'll go fork the project and see if can just drop the whole monkeypatch.

willkg commented 9 years ago

I'm feeling pretty frustrated with this issue. I want to help you, but I need to understand the specifics of what you're observing which isn't happening in our tests. Thus I'm asking questions to clarify what you're saying so that I can then reproduce the issue and then figure out what's going on.

I presume you have a development environment. Running a lower version of Elasticsearch is pretty easy to do. Can you downgrade to Elasticsearch 1.2 in your development environment and reproduce your issue? I'm not asking you to downgrade everything permanently.

Can you copy/paste the traceback you're getting?

adriaant commented 9 years ago

I downgraded to elasticsearch 1.2.2 (version 1.2. was not available from the ES site). I ran a bulk with one document with op_type 'delete'

Stacktrace:

[09/Oct/2014 14:58:36] DEBUG [elasticsearch:59] > {"delete": {"_type": "product", "_id": 113614, "_index": "products"}}

[09/Oct/2014 14:58:36] DEBUG [elasticsearch:60] < {"took":11,"errors":false,"items":[{"delete":{"_index":"products","_type":"product","_id":"113614","_version":1,"status":404,"found":false}}]}
[09/Oct/2014 14:58:36] INFO [elasticsearch.trace:67] curl -XPOST 'http://localhost:9200/_bulk?pretty' -d '{
  "delete": {
    "_id": 113614,
    "_index": "products",
    "_type": "product"
  }
}'
[09/Oct/2014 14:58:36] DEBUG [elasticsearch.trace:70] #[200] (0.020s)
#{
#  "errors": false,
#  "items": [
#    {
#      "delete": {
#        "_id": "113614",
#        "_index": "products",
#        "_type": "product",
#        "_version": 1,
#        "found": false,
#        "status": 404
#      }
#    }
#  ],
#  "took": 11
#}
[09/Oct/2014 14:58:36] CRITICAL [apps:75] Error in bulk delete: KeyError('index',)
KeyError: KeyError('index',)
  File "/Users/adriaant/sandbox/cdw/cdw/apps/search/tasks.py", line 66, in delete_objects
    res = bulk(es, actions)

  File "/Users/adriaant/.virtualenvs/cdw/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 145, in bulk
    for ok, item in streaming_bulk(client, actions, **kwargs):

  File "/Users/adriaant/.virtualenvs/cdw/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 104, in streaming_bulk
    resp = client.bulk(bulk_actions, **kwargs)

  File "/Users/adriaant/.virtualenvs/cdw/lib/python2.7/site-packages/elasticutils/monkeypatch.py", line 33, in _fixed_bulk
    ret['items'] = [fix_item(item) for item in ret['items']]

  File "/Users/adriaant/.virtualenvs/cdw/lib/python2.7/site-packages/elasticutils/monkeypatch.py", line 27, in fix_item
    if 'ok' in item['index']:
willkg commented 9 years ago

Aha! Thank you for providing the key information.

What's going on is that none of the ElasticUtils tests do anything with bulk other than indexing. That's why the monkey patch is failing.

I'm repurposing this issue to something more correct.

adriaant commented 9 years ago

Ok, that makes sense. Thanks much for your time!

willkg commented 9 years ago

I'll try to fix this today or tomorrow and push out a 0.10.2.

adriaant commented 9 years ago

Awesome, thanks!

willkg commented 9 years ago

That should fix it. I'll do a 0.10.2 release now.

adriaant commented 9 years ago

Yep, works well now, thanks much!