yakaz / elasticsearch-action-updatebyquery

ElasticSearch Update By Query action plugin

Seems to eat up a lot of memories #7

Open wasyyyy opened 10 years ago

wasyyyy commented 10 years ago

I tried this cool plugin on my test machine with 32 GB of RAM. After I allocated 12 GB of memory to the JVM and restarted ES, I submitted an update_by_query request that updates approximately 81,000 documents out of 20,000,000 total. After several minutes I saw an OOM exception thrown in the ES console. I had to submit the same request two or three times to complete all the updates for those 81,000 documents. Do you have any idea why this happens? Could it be a bug? The documents affected by this request certainly add up to less than 1 GB.

By the way, updating 21,321 documents (among those 20,000,000) takes 14.4 minutes, which is a little longer than expected. Do you have any suggestions to speed up this process? I'd really appreciate it if you could help me with this.

ofavre commented 10 years ago

Have you tried tweaking the action.updatebyquery.bulk_size elasticsearch configuration option?

wasyyyy commented 10 years ago

Thanks for your quick reply! I did what you suggested and set the bulk size yesterday before I left the office. I set bulk_size to 500, and it successfully updated all 81,000 records, but it took almost 3 hours. ES still ran out of memory afterwards, though. I have no idea why, but I will try tuning the bulk_size parameter and see if that works. Again, thanks very much! I will keep you posted on this.
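
To put the two timings reported so far on the same scale, here is a minimal Python sketch that converts them to documents per second. The helper name docs_per_second is hypothetical, "almost 3 hours" is taken as exactly 3 hours (an assumption that slightly understates the second rate), and the bulk_size used for the first run was not stated in the thread.

```python
def docs_per_second(doc_count: int, elapsed_seconds: float) -> float:
    """Average update throughput in documents per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return doc_count / elapsed_seconds


# Figures reported in this thread.
first_run = docs_per_second(21_321, 14.4 * 60)       # 14.4 minutes, bulk_size not stated
bulk_500_run = docs_per_second(81_000, 3 * 60 * 60)  # "almost 3 hours", assumed to be 3 h

print(f"first run:     {first_run:.1f} docs/s")
print(f"bulk_size=500: {bulk_500_run:.1f} docs/s")
```

Under these assumptions the first run works out to roughly 24.7 documents per second and the bulk_size=500 run to about 7.5, or slightly more since the run took a little under 3 hours. The two runs touched different numbers of documents and may not be directly comparable, so this is only a rough indication that the smaller bulk size traded speed for completing in one request.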

ofavre commented 10 years ago

Do you still face the same problem with v2.2.0 of the plugin and ES 1.3?

pentium10 commented 9 years ago

@wasyyyy We would like to find out whether you still have the issue, and which ES and plugin versions you used. We don't use the plugin yet, but we are actively considering it, so any recent updates on how it performs would be handy.