jprante / elasticsearch-jdbc

JDBC importer for Elasticsearch
Apache License 2.0

How to set 'threadpoolsize', 'interval', 'max_bulk_actions', etc. to avoid this error? #444

Open tolinwei opened 9 years ago

tolinwei commented 9 years ago

I deleted and re-submitted a river job. After the execution, I noticed that parts of the data in the index were NOT updated. Tailing the log file, I found many errors like this:

[1744]: index [cerebro_unnested], type [profile], id [63593243], message [EsRejectedExecutionException[rejected execution (queue capacity 50) on org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1@6b2b592b]]
[1745]: index [cerebro_unnested], type [profile], id [63593244], message [EsRejectedExecutionException[rejected execution (queue capacity 50) on org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1@265ec505]]
[1746]: index [cerebro_unnested], type [profile], id [63593245], message [EsRejectedExecutionException[rejected execution (queue capacity 50) on org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1@7e0d7872]]
[1747]: index [cerebro_unnested], type [profile], id [63593246], message [EsRejectedExecutionException[rejected execution (queue capacity 50) on org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1@17d8def1]]

I searched for this error, and several posts say it happens because too many bulk operations pile up in the queue, and that we need to adjust the value of

threadpool.bulk.queue_size: {queue size}

I also noticed that there are parameters like 'threadpoolsize', 'interval', 'max_bulk_actions', etc. in the README of this repo. Are there any suggestions on adjusting these settings to avoid this error? Thanks in advance.
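For reference, here is a minimal sketch of where those parameters sit in the importer/river definition, however it is submitted. The connection details and values are placeholders for illustration, not recommendations; the README is the authoritative list:

{
    "type" : "jdbc",
    "jdbc" : {
        "url" : "jdbc:mysql://localhost:3306/mydb",
        "user" : "user",
        "password" : "password",
        "sql" : "select * from profile",
        "interval" : "1h",
        "threadpoolsize" : 4,
        "max_bulk_actions" : 10000
    }
}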

jprante commented 9 years ago

You should increase max_bulk_actions to a higher number like 5000 or 10000; this reduces the number of concurrent bulk requests.
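To make the arithmetic concrete with made-up numbers: importing one million rows with max_bulk_actions = 1000 produces roughly 1,000 bulk requests, while max_bulk_actions = 10000 produces roughly 100, so far fewer requests compete for the 50 slots in the bulk queue.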

Changing threadpool.bulk.queue_size is not the correct solution and should be avoided, because it can overwhelm the cluster when the value is set too high. The default value is carefully chosen and is fine.

tolinwei commented 9 years ago

Thanks @jprante. Now I understand the point about not changing the value of threadpool.bulk.queue_size. However, I didn't set max_bulk_actions, which means it's 10000 by default according to your README. Why does this error still happen?

jprante commented 9 years ago

In that case, the ES cluster might have become very slow at accepting bulk index requests, so they piled up. Maybe the server log file contains messages with more details.
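One way to check this, assuming a standard Elasticsearch setup rather than anything importer-specific, is to poll the bulk thread pool while the import runs; a climbing 'rejected' counter confirms the queue is overflowing:

curl -s 'localhost:9200/_cat/thread_pool?v&h=host,bulk.active,bulk.queue,bulk.rejected'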

sliontc commented 8 years ago

I recently got this error too, with max_bulk_actions set to 20000:

[1277]: index [authors], type [author], id [19341514], message [RemoteTransportException[[node-181][192.168.1.181:9300][indices:data/write/bulk[s]]]; nested: EsRejectedExecutionException[rejected execution of org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase$1@85080 on EsThreadPoolExecutor[bulk, queue capacity = 50, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@1cc1d3f[Running, pool size = 8, active threads = 8, queued tasks = 50, completed tasks = 3300]]];]

Any help?