Open cfangpp opened 6 months ago
Pinging @elastic/es-search (Team:Search)
IO throttling is an interesting problem. In fact, we are working on substantially adjusting the logic there in Lucene: https://github.com/apache/lucene/pull/13293
On modern NVMe drives, the way throttling works now doesn't make sense.
Have you tried turning off auto-throttling altogether to see if your performance improves?
That means setting index.merge.scheduler.auto_throttle to false.
Thanks for your reply.
My cluster runs on mechanical hard drives, 12 drives per node, so I think IO throttling is very useful for it. However, Elasticsearch's IndexThrottle limits indexing when the number of launched MergeThreads exceeds max_merge_count, as in the code below from "EngineMergeScheduler.java":
public synchronized void beforeMerge(OnGoingMerge merge) {
    int maxNumMerges = mergeScheduler.getMaxMergeCount();
    if (numMergesInFlight.incrementAndGet() > maxNumMerges) {
        if (isThrottling.getAndSet(true) == false) {
            logger.info("now throttling indexing: numMergesInFlight={}, maxNumMerges={}", numMergesInFlight, maxNumMerges);
            activateThrottling();
        }
    }
}
I just used "ElasticsearchConcurrentMergeScheduler.maybeStall" to limit the launching of MergeThreads so that IndexThrottle is not activated, and it removed most of the indexing rejections.
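To make the counter logic in the quoted beforeMerge easier to follow, here is a minimal plain-JDK sketch of it. The class name MergeThrottleSketch is hypothetical, logging is replaced with System.out, and the afterMerge counterpart is an assumption about how throttling would be released (it is not quoted in this thread), so treat it as an illustration rather than the Elasticsearch implementation.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the engine's merge listener; not Elasticsearch code.
class MergeThrottleSketch {
    private final int maxMergeCount;
    private final AtomicInteger numMergesInFlight = new AtomicInteger();
    private final AtomicBoolean isThrottling = new AtomicBoolean();

    MergeThrottleSketch(int maxMergeCount) {
        this.maxMergeCount = maxMergeCount;
    }

    // Same shape as the quoted beforeMerge: called once for every merge that starts.
    synchronized void beforeMerge() {
        if (numMergesInFlight.incrementAndGet() > maxMergeCount) {
            if (isThrottling.getAndSet(true) == false) {
                System.out.println("now throttling indexing: numMergesInFlight=" + numMergesInFlight.get());
            }
        }
    }

    // Assumed counterpart: release the throttle once the count drops below the limit.
    synchronized void afterMerge() {
        if (numMergesInFlight.decrementAndGet() < maxMergeCount) {
            if (isThrottling.getAndSet(false)) {
                System.out.println("stop throttling indexing: numMergesInFlight=" + numMergesInFlight.get());
            }
        }
    }

    boolean isThrottling() {
        return isThrottling.get();
    }

    public static void main(String[] args) {
        MergeThrottleSketch throttle = new MergeThrottleSketch(3);
        for (int i = 0; i < 4; i++) {
            throttle.beforeMerge(); // the 4th start exceeds the limit of 3
        }
        throttle.afterMerge(); // 3 in flight: not below the limit, still throttling
        throttle.afterMerge(); // 2 in flight: throttling is released
    }
}
```

The point of the sketch is that every launched MergeThread counts toward numMergesInFlight, whether or not IO throttling has paused it, so a burst of launches is enough to trip the throttle.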
@benwtrent
Pinging @elastic/es-distributed (Team:Distributed)
Description
Lucene only uses max_thread_count to pause MergeThreads through IO throttling. When many merges are pending, ConcurrentMergeScheduler launches a corresponding number of MergeThreads and starts them running, then uses IO throttling to pause the largest MergeThreads whose thread index exceeds max_thread_count.
So if the Lucene IndexWriter has many pending merges, ElasticsearchConcurrentMergeScheduler activates IndexThrottle once the in-flight merges exceed max_merge_count, asking the engine to throttle incoming indexing requests to one thread. A large number of indexing requests may be rejected because of this.
Since many MergeThreads end up paused by Lucene's IO throttling anyway, we might as well stop launching new MergeThreads. Can I use the ElasticsearchConcurrentMergeScheduler.maybeStall method to limit launching and running new MergeThreads, bounded by max_thread_count or max_merge_count?
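The idea of holding back new MergeThreads instead of launching and then pausing them can be sketched with plain JDK primitives. This is only an illustration under stated assumptions: MergeLaunchGate, launch, and simulate are hypothetical names, a Semaphore stands in for the stall condition, and a short sleep stands in for merge work. It does not use or reproduce the Lucene or Elasticsearch maybeStall code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical gate: at most maxRunningMerges merge threads exist at any time.
class MergeLaunchGate {
    private final Semaphore permits;
    private final AtomicInteger running = new AtomicInteger();
    private final AtomicInteger peakRunning = new AtomicInteger();

    MergeLaunchGate(int maxRunningMerges) {
        this.permits = new Semaphore(maxRunningMerges);
    }

    // Stalls the caller until a slot is free, then starts the merge on a new thread.
    Thread launch(Runnable merge) {
        permits.acquireUninterruptibly();
        Thread thread = new Thread(() -> {
            int now = running.incrementAndGet();
            peakRunning.accumulateAndGet(now, Math::max); // remember the highest concurrency seen
            try {
                merge.run();
            } finally {
                running.decrementAndGet();
                permits.release(); // free the slot only after the merge has finished
            }
        });
        thread.start();
        return thread;
    }

    int peakRunning() {
        return peakRunning.get();
    }

    // Submits pendingMerges fake merges and returns the peak number that ran concurrently.
    static int simulate(int pendingMerges, int maxRunningMerges) {
        MergeLaunchGate gate = new MergeLaunchGate(maxRunningMerges);
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < pendingMerges; i++) {
            threads.add(gate.launch(() -> {
                try {
                    Thread.sleep(5); // stand-in for merge work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for merges", e);
            }
        }
        return gate.peakRunning();
    }

    public static void main(String[] args) {
        int peak = simulate(10, 3);
        System.out.println("peak running merges = " + peak + " (limit 3)");
    }
}
```

With a gate like this the in-flight count can never exceed the limit, so a counter-based throttle set at or above that limit would not trip. The trade-off is that the thread asking for the merge waits instead, which is itself a form of backpressure on whoever triggered the merge.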