Fix bulk indexing problems with Elasticsearch 1.0

mozilla / elasticutils

[deprecated] A friendly chainable ElasticSearch interface for python

http://elasticutils.rtfd.org

BSD 3-Clause "New" or "Revised" License


Fix bulk indexing problems with Elasticsearch 1.0 #242

Closed by willkg 10 years ago

willkg commented 10 years ago

In order for ElasticUtils to work for both Elasticsearch 0.90 and Elasticsearch 1.0 using elasticsearch-py 0.4.5, we need to do some monkey-patching of elasticsearch-py.

In this case, calling Elasticsearch.client.bulk() returns an 'ok' field with ES 0.90 and a 'status' field with ES 1.0. This patch sets the 'ok' field based on the 'status' field so that the bulk indexing infrastructure in elasticsearch-py 0.4.5 is testing the right thing and not raising BulkIndexingErrors.
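The idea can be sketched roughly as below. This is not the patch itself, only a minimal illustration under some assumptions: that the bulk response is a dict with an `items` list, that each item is a single-key dict such as `{'index': {...}}`, and that any 2xx `status` counts as success. The function name `set_ok_from_status` and the sample responses are hypothetical.

```python
def set_ok_from_status(bulk_response):
    """Add an 'ok' flag to each bulk item that only carries a 'status' code.

    Hypothetical helper: the response layout is an assumption, not taken
    from the actual patch.
    """
    for item in bulk_response.get('items', []):
        # Each item is assumed to be a single-key dict, e.g. {'index': {...}}.
        for result in item.values():
            if 'ok' not in result and 'status' in result:
                # Assumption: any 2xx status means the operation succeeded.
                result['ok'] = 200 <= result['status'] < 300
    return bulk_response


# Made-up responses shaped like the two styles described above.
es_090_style = {'items': [{'index': {'_id': '1', 'ok': True}}]}
es_10_style = {'items': [{'index': {'_id': '1', 'status': 201}},
                         {'index': {'_id': '2', 'status': 409}}]}

patched_090 = set_ok_from_status(es_090_style)
patched_10 = set_ok_from_status(es_10_style)
```

Items that already carry an `ok` field are left untouched, so a response in the 0.90 style passes through unchanged.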

Fixes #241

willkg commented 10 years ago

This is kind of gross, but I don't see another way to do this.

Incidentally, this path might be fraught with danger, since there are many API differences between ES 0.90 and 1.0. Still, I want to get as far as we can: if we can get most of the way, that's probably good enough for an EU 0.10 that eases the migration path for all the things.

willkg commented 10 years ago

Travis is currently testing changes with both Elasticsearch 0.90 and Elasticsearch 1.0. So if it's happy, we're good on that front.

Mostly, I need a code review here. Is the code crazy-pants? Is there a nicer way to do it?

Wilfred commented 10 years ago

This looks completely reasonable to me.

Long term, it would probably make sense to move to support elasticsearch-py 1.0. Perhaps EU could throw an exception if the user had elasticsearch-py 0.4.X installed for talking to a 1.0 elasticsearch machine, and vice versa?
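A rough sketch of what such a check might look like, purely as an illustration: `check_compat` is a hypothetical name, the client version is assumed to be available as a tuple like `(0, 4, 5)`, and the server version as a string like `"1.0.0"` that would have to be fetched from the cluster first.

```python
def check_compat(client_version, server_version):
    """Raise if a pre-1.0 client is paired with a 1.0+ server, or vice versa.

    Hypothetical helper. client_version is a tuple such as (0, 4, 5);
    server_version is a dotted string such as "0.90.11" or "1.0.0".
    """
    client_is_1x = client_version[0] >= 1
    # Only the leading component of the server version matters here.
    server_is_1x = int(server_version.split('.')[0]) >= 1
    if client_is_1x != server_is_1x:
        raise RuntimeError(
            'elasticsearch-py %s does not match Elasticsearch %s'
            % ('.'.join(str(part) for part in client_version), server_version)
        )


# Matching pair: no exception.
check_compat((0, 4, 5), '0.90.11')

# Mismatched pair: the check raises.
try:
    check_compat((0, 4, 5), '1.0.0')
    mismatch_detected = False
except RuntimeError:
    mismatch_detected = True
```

Note that this only covers the comparison itself; getting the server version still requires a round trip to Elasticsearch.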

willkg commented 10 years ago

elasticsearch-py 1.0+ doesn't support Elasticsearch 0.90. So, yes--we definitely want to switch but can't for this version without doing more hackery.

Plus there's no way to know which version of Elasticsearch EU is talking to without asking the cluster, and I don't think we want to be asking Elasticsearch that all the time. I'm also not sure whether people run environments that have multiple Elasticsearch clusters of different versions (I really hope not).