ChristopherRabotin / bungiesearch

UNMAINTAINED CODE -- Elasticsearch-dsl-py django wrapper with mapping generator
BSD 3-Clause "New" or "Revised" License
67 stars, 20 forks

Performance boost for bulk delete, and additional abstraction in utils #99

Closed afrancis13 closed 9 years ago

afrancis13 commented 9 years ago

I added bulk delete functionality a couple of weeks ago, but I noticed that it could be done much more efficiently. Instead of serializing the entire document for each item in a batch, you can specify just the Elasticsearch primary key (often an integer) along with the operation type ('delete') and delete that way (see streaming_bulk in https://github.com/elastic/elasticsearch-py/blob/master/elasticsearch/helpers/__init__.py). Since things were getting a little cluttered inside update, I pulled some of its sub-functionality out into helper functions and added documentation. I'll comment on one other thing below.
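To illustrate the idea, here is a minimal sketch of what key-only delete actions could look like. It is not the code from this PR: the helper name `delete_actions`, the index name `my_index`, and the sample primary keys are hypothetical, and the action keys (`_op_type`, `_index`, `_id`, and the optional `_type`) follow the action format accepted by the elasticsearch-py bulk helpers.

```python
def delete_actions(index_name, pks, doc_type=None):
    """Yield one minimal bulk 'delete' action per primary key.

    Hypothetical helper: no document body is serialized, only the
    metadata Elasticsearch needs to locate and remove each document.
    """
    for pk in pks:
        action = {
            '_op_type': 'delete',  # operation type read by the bulk helpers
            '_index': index_name,
            '_id': pk,
        }
        # Older Elasticsearch versions also expect a document type.
        if doc_type is not None:
            action['_type'] = doc_type
        yield action


# Example with made-up values: three primary keys in one index.
actions = list(delete_actions('my_index', [1, 2, 3]))
```

A generator like this could then be passed to `elasticsearch.helpers.bulk(client, delete_actions(...))` or `elasticsearch.helpers.streaming_bulk`, which handle the batching; the client setup is omitted here so the sketch runs without a cluster.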