elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
69.56k stars 24.62k forks source link

Usage-based caching of filters can be ineffective without tuning #16031

Closed dbaggott closed 8 years ago

dbaggott commented 8 years ago

For some use cases, the default settings of UsageTrackingFilterCachingPolicy are inappropriate. It would be ideal if those settings were configurable.

For example, the default configuration includes a history that looks at the last 256 filters. In the extreme case, if you had a stream of fairly complicated filters where the total number of distinct filters was significantly greater than 256, then you'd never get caching of any filters!

In a real world example, we just went through an upgrade from ES 1.7 to 2.1.1 and experienced pretty severe degradation of response times. Analysis of the query_cache indicated that a) the cache was relatively sparsely populated; b) the miss ratio was very high. Our particular use case was not well handled b/c we filter documents using arbitrary boolean combinations of ~1k terms. As best as I could tell, the limited 256 history size meant that we were only caching the very most common terms. Inefficient caching combined with a fairly large document count led to increased response times across the board and very painful response times under heavier load.

Tellingly, what rescued performance was completely bypassing the usage-based cache via the undocumented index.queries.cache.everything: true setting and relying purely on the LRU-nature of the cache to keep the most useful filters around.

Tuning the usage policy would be great. And, arguably, documented support of a policy that simply relies on using LRU to determine the useful filters would also be a good idea. (It did not escape my attention that the index.queries.cache.everything setting appears to have only been added for testing purposes!)

jrots commented 8 years ago

+1, been experiencing similar issues and been looking for this setting too for a while!

dbaggott commented 8 years ago

Related question: I know the usage policy (including the underlying buffer that stores the history) is scoped to the shard but isn't the actual usage accounting done at the segment level? IF that's true, the mismatch adds a bit of complexity/variability as the runtime behavior will vary with the count of (large enough) segments -- in effect, segment count will shorten the effective history size. And that, in turn, alters the significance of the thresholds. So, as segments come and go, the caching behavior will subtly (or not so subtly) change.

synhershko commented 8 years ago

We are also experiencing a very similar issue. In our case it's not many different filters that overwhelm the LRU cache, but a single query with what could be considered a very heavy Terms Query in a filter context (thousands of terms on one field; think permissions filtering scenario), that simply refuse to get cached. Setting the aforementioned configuration option on the nodes cause it to be cached, dropping query latency from 15 seconds to less than 1 second.

@dbaggott to be fair, historically the query cache will represent the cache of the query results, not the filters being used. This seems to still be the case: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-cache.html. Pre 2.x there was a filter cache, which was either removed (since it's not managed by ES anymore apparently, but by Lucene) or hidden (since the user has no control over it anymore). Be that as it may, I'm pretty sure the current visibility on filter caching behavior is zero.

When Elasticsearch moved to automatic decision making on filter caching following Lucene's path I suspected such errors will arise - no automatic decision process is perfect. An all-or-nothing approach like the OP and us are having is a bit disastrous. Best solution would be to bring back at least some user control over filter-caching, a la cache key which is probably the most powerful feature ES's query DSL had (@bleskes, we discussed this shortly not long ago..).

I'm hoping ES / Lucene folks will realise some level of control over caches should be given to users - for error scenarios or for advanced usages.

dbaggott commented 8 years ago

@synhershko, the documentation and the terminology are all a little confusing given that, as you alluded to, queries and filters merged but the "query cache" I'm referring to and the one in the documentation you mentioned is for *filters•. Note the following (my emphasis):

The query cache only caches queries which are being used in a filter context.

This cache is not to be confused with the shard request cache.

As for visibility into the cache where filters are stored, there is good visibility into node-level stats via the node stats api: curl -XGET 'http://localhost:9200/_nodes/stats?pretty' (look for the "query_cache" entry). It include count, hits, misses, evictions, etc. You cannot interrogate the actual contents of the cache (only the hash of the filter and the doc id set are stored).

I'm guessing your problem is exactly the problem of the history being too short for your use case. A query in a filter context would be cached according to the same usage policy I'm referring to and each of your multiple terms is probably taking up a slot in the history buffer -- so the history never sees the same term more than once and caching is effectively disabled under the default policy.

jpountz commented 8 years ago

I am all for improving how query caching works, but first I would like to make sure that we are not jumping too quickly to the conclusion. If you have identified requests that are much slower in 2.1 than in 1.7, would you be able to share them and capture hot threads while they are running (several times if possible) so that we can get an idea of what the bottleneck is?

jpountz commented 8 years ago

I know the usage policy (including the underlying buffer that stores the history) is scoped to the shard but isn't the actual usage accounting done at the segment level?

No, actual usage accounting is done at the shard level too.

dbaggott commented 8 years ago

@jpountz, I'll gather documentation in support of the conclusion.

Is the conceptual problem clear from from my description though? It's probably also easy to write a unit test against the default UsageTrackingFilterCachingPolicy that illustrates the problem case and then any query with a sufficiently complex filter context is going to trigger the problem.

Also, just to make sure I understand:

No, actual usage accounting is done at the shard level too.

If there is a shard with 3 segments that are large enough to satisfy the segment policy, how many times will UsageTrackingFilterCachingPolicy.shouldCache be called? You're saying only once, right?

jpountz commented 8 years ago

Is the conceptual problem clear from from my description though? It's probably also easy to write a unit test against the default UsageTrackingFilterCachingPolicy that illustrates the problem case and then any query with a sufficiently complex filter context is going to trigger the problem.

Yes it is clear to me. If you have queries that involve hundreds of filters, then nothing will be cached. But there has to be a limit anyway so you will always be able to construct queries that always bypass the cache.

I am not denying the fact that the query cache might be the problem but I would like to make sure it is in your case. There are so many things that happen at search time, I have been surprised many times how the actual cause of a slowdown was very different from my initial expectations in spite of the fact that understanding slow queries is something that I do all the time.

The problem with this cache is that it does not behave like a regular cache, that can only make things faster. On the contrary, many queries leverage on-disk skip lists in order to only read the information that they are interested in. By caching them, we are forced to read every single matching document so that we can build a cached entry. So the worst thing that this cache could do would be to cache matching docs and then never reuse it. Unfortunately this happened a lot in 1.x, hence the pickyness of the new cache to get evidence of reuse before caching a filter.

If there is a shard with 3 segments that are large enough to satisfy the segment policy, how many times will UsageTrackingFilterCachingPolicy.shouldCache be called? You're saying only once, right?

shouldCache will be called 3 times, but onCache which increases the counters will be called only once.

dbaggott commented 8 years ago

Thanks for the explanations (and, w/r/t the shard-level vs segment-level question, I'm assuming you meant onUse when you said onCache).

Here's some additional information surrounding what we're seeing.

In a cluster running 2.1.1, we configured it so all but one node had the index.queries.cache.everything: true setting. We then ran a realistic query mixture against the cluster in the ballpark of 100 requests per second for about 10 minutes.

The one node without the "cache everything" behavior showed dramatically higher cpu for the entire duration as seen below (the drop off at the end is when load stopped)

data node cpu

Here are some hot threads for the one node without cache everything behavior:

::: {data-node-without-cache-everything}{xMs5UBKwSq6DdNsoVSzBKw}{ip}{ip:9300}{max_local_storage_nodes=1, aws_availability_zone=us-west-2c, master=false}
   Hot threads at 2016-01-18T20:21:52.375Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

   81.1% (405.7ms out of 500ms) cpu usage by thread 'elasticsearch[data-node-without-cache-everything][search][T#6]'
     3/10 snapshots sharing following 40 elements
       sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
       org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:180)
       org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
       org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:140)
       org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:116)
       org.apache.lucene.codecs.lucene50.ForUtil.readBlock(ForUtil.java:197)
       org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.refillDocs(Lucene50PostingsReader.java:368)
       org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc(Lucene50PostingsReader.java:393)
       org.apache.lucene.util.BitSet.or(BitSet.java:94)
       org.apache.lucene.util.FixedBitSet.or(FixedBitSet.java:271)
       org.apache.lucene.util.DocIdSetBuilder.add(DocIdSetBuilder.java:87)
       org.apache.lucene.queries.TermsQuery$1.rewrite(TermsQuery.java:286)
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.join.ToParentBlockJoinQuery$BlockJoinWeight.scorer(ToParentBlockJoinQuery.java:160)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorWeight.scorer(FunctionScoreQuery.java:132)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:256)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     7/10 snapshots sharing following 12 elements
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

   76.9% (384.6ms out of 500ms) cpu usage by thread 'elasticsearch[data-node-without-cache-everything][search][T#23]'
     4/10 snapshots sharing following 40 elements
       sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
       org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:180)
       org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
       org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:140)
       org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:116)
       org.apache.lucene.codecs.lucene50.ForUtil.readBlock(ForUtil.java:197)
       org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.refillDocs(Lucene50PostingsReader.java:368)
       org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc(Lucene50PostingsReader.java:393)
       org.apache.lucene.util.BitSet.or(BitSet.java:94)
       org.apache.lucene.util.FixedBitSet.or(FixedBitSet.java:271)
       org.apache.lucene.util.DocIdSetBuilder.add(DocIdSetBuilder.java:87)
       org.apache.lucene.queries.TermsQuery$1.rewrite(TermsQuery.java:286)
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.join.ToParentBlockJoinQuery$BlockJoinWeight.scorer(ToParentBlockJoinQuery.java:160)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorWeight.scorer(FunctionScoreQuery.java:132)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:256)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     2/10 snapshots sharing following 36 elements
       sun.nio.ch.NativeThread.current(Native Method)
       sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:46)
       sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:737)
       sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
       org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:180)
       org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
       org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:140)
       org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:116)
       org.apache.lucene.codecs.lucene50.ForUtil.readBlock(ForUtil.java:197)
       org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.refillDocs(Lucene50PostingsReader.java:368)
       org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc(Lucene50PostingsReader.java:393)
       org.apache.lucene.util.BitSet.or(BitSet.java:94)
       org.apache.lucene.util.FixedBitSet.or(FixedBitSet.java:271)
       org.apache.lucene.util.DocIdSetBuilder.add(DocIdSetBuilder.java:87)
       org.apache.lucene.queries.TermsQuery$1.rewrite(TermsQuery.java:297)
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.elasticsearch.search.aggregations.bucket.filter.FilterAggregator.getLeafCollector(FilterAggregator.java:61)
       org.elasticsearch.search.aggregations.AggregatorBase.getLeafCollector(AggregatorBase.java:132)
       org.elasticsearch.search.aggregations.AggregatorBase.getLeafCollector(AggregatorBase.java:131)
       org.elasticsearch.search.aggregations.AggregatorBase.getLeafCollector(AggregatorBase.java:38)
       org.apache.lucene.search.MultiCollector.getLeafCollector(MultiCollector.java:117)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:763)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     2/10 snapshots sharing following 36 elements
       sun.nio.ch.NativeThread.current(Native Method)
       sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:46)
       sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:737)
       sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
       org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:180)
       org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
       org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:140)
       org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:116)
       org.apache.lucene.codecs.lucene50.ForUtil.readBlock(ForUtil.java:197)
       org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.refillDocs(Lucene50PostingsReader.java:368)
       org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc(Lucene50PostingsReader.java:393)
       org.apache.lucene.util.BitSet.or(BitSet.java:94)
       org.apache.lucene.util.FixedBitSet.or(FixedBitSet.java:271)
       org.apache.lucene.util.DocIdSetBuilder.add(DocIdSetBuilder.java:87)
       org.apache.lucene.queries.TermsQuery$1.rewrite(TermsQuery.java:286)
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.elasticsearch.search.aggregations.bucket.filter.FilterAggregator.getLeafCollector(FilterAggregator.java:61)
       org.elasticsearch.search.aggregations.AggregatorBase.getLeafCollector(AggregatorBase.java:132)
       org.elasticsearch.search.aggregations.AggregatorBase.getLeafCollector(AggregatorBase.java:131)
       org.elasticsearch.search.aggregations.AggregatorBase.getLeafCollector(AggregatorBase.java:38)
       org.apache.lucene.search.MultiCollector.getLeafCollector(MultiCollector.java:117)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:763)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     2/10 snapshots sharing following 45 elements
       sun.nio.ch.FileDispatcherImpl.pread0(Native Method)
       sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:52)
       sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:220)
       sun.nio.ch.IOUtil.read(IOUtil.java:197)
       sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:741)
       sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
       org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:180)
       org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
       org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:140)
       org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:116)
       org.apache.lucene.codecs.lucene50.ForUtil.readBlock(ForUtil.java:197)
       org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.refillDocs(Lucene50PostingsReader.java:368)
       org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc(Lucene50PostingsReader.java:393)
       org.apache.lucene.util.BitSet.or(BitSet.java:94)
       org.apache.lucene.util.FixedBitSet.or(FixedBitSet.java:271)
       org.apache.lucene.util.DocIdSetBuilder.add(DocIdSetBuilder.java:87)
       org.apache.lucene.queries.TermsQuery$1.rewrite(TermsQuery.java:297)
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.join.ToParentBlockJoinQuery$BlockJoinWeight.scorer(ToParentBlockJoinQuery.java:160)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorWeight.scorer(FunctionScoreQuery.java:132)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.booleanScorer(BooleanWeight.java:201)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:233)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

   72.2% (360.7ms out of 500ms) cpu usage by thread 'elasticsearch[data-node-without-cache-everything][search][T#11]'
     2/10 snapshots sharing following 43 elements
       sun.nio.ch.NativeThread.current(Native Method)
       sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:46)
       sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:737)
       sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
       org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:180)
       org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
       org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:140)
       org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:116)
       org.apache.lucene.codecs.lucene50.ForUtil.readBlock(ForUtil.java:197)
       org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.refillDocs(Lucene50PostingsReader.java:368)
       org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc(Lucene50PostingsReader.java:393)
       org.apache.lucene.util.BitSet.or(BitSet.java:94)
       org.apache.lucene.util.FixedBitSet.or(FixedBitSet.java:271)
       org.apache.lucene.util.DocIdSetBuilder.add(DocIdSetBuilder.java:87)
       org.apache.lucene.queries.TermsQuery$1.rewrite(TermsQuery.java:297)
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.join.ToParentBlockJoinQuery$BlockJoinWeight.scorer(ToParentBlockJoinQuery.java:160)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorWeight.scorer(FunctionScoreQuery.java:132)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:256)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     6/10 snapshots sharing following 43 elements
       sun.nio.ch.NativeThread.current(Native Method)
       sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:46)
       sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:737)
       sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
       org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:180)
       org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
       org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:140)
       org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:116)
       org.apache.lucene.codecs.lucene50.ForUtil.readBlock(ForUtil.java:197)
       org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.refillDocs(Lucene50PostingsReader.java:368)
       org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.nextDoc(Lucene50PostingsReader.java:393)
       org.apache.lucene.util.BitSet.or(BitSet.java:94)
       org.apache.lucene.util.FixedBitSet.or(FixedBitSet.java:271)
       org.apache.lucene.util.DocIdSetBuilder.add(DocIdSetBuilder.java:87)
       org.apache.lucene.queries.TermsQuery$1.rewrite(TermsQuery.java:286)
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.join.ToParentBlockJoinQuery$BlockJoinWeight.scorer(ToParentBlockJoinQuery.java:160)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorWeight.scorer(FunctionScoreQuery.java:132)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:256)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     2/10 snapshots sharing following 28 elements
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.join.ToParentBlockJoinQuery$BlockJoinWeight.scorer(ToParentBlockJoinQuery.java:160)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorWeight.scorer(FunctionScoreQuery.java:132)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.booleanScorer(BooleanWeight.java:201)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:233)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

Here's the general style of the filter portion of the queries that struggle without the cache everything behavior. There are further variations on this but I think this captures the core problem:

"filter": {
  "bool": {
    "must": [
      {
        "nested": {
          "query": {
            "terms": {
              "items.fieldA": [ 1, 2, 3 ]
            }
          },
          "path": "items"
        }
      },
      {
        "bool": {
          "must_not": {
            "multi_match": {
              "query": "multiple terms that stay constant across most queries",
              "fields": [ "field1", "field2" ]
            }
          }
        }
      }
    ]
  }
}

It's the "1, 2, 3" portion of the nested "items.fieldA" that varies. The actual values are arbitrary and can range from 1 value to 100s of values. In practice, there's probably a couple hundred different combinations of values (some with a few values some with 100s) that represent the bulk of the variants. However, many more combinations are requested with less frequency. This is the core of the problem for us, the 256 history is just too short to allow the relevant filters to be cached and it's only when, by luck, there's a series of searches with a small number of values that are repeated across searches that we get any caching! And we probably never get much caching of the "non-leaf" filters.

Finally, I don't have the query_stats data handy but without the "cache everything" behavior, we see a few hundred filters cached and a very high miss ratio. With the "cache everything" behavior, we see many 1000s of filters cached and a very high hit ratio.

Let me know if that's sufficient information or if anything is unclear.

Thanks!

jpountz commented 8 years ago

I'm assuming you meant onUse when you said onCache

Woops indeed.

Thanks for the info. The hot threads suggest that the bottleneck is reading posting data from disk. Could you report how much memory your nodes have, how much of it is given to the JVM and how large your data directory is? I am thinking that maybe you did not give enough memory to the filesystem cache? Since the hit ratio of the filter cache is high when caching everything, then the same should be true for the filesystem cache.

Could you also check whether putting index.store.type: mmapfs in your elasticsearch.yml file (and restart) makes the performance any different?

dbaggott commented 8 years ago

Interesting.

Memory: 64 GB, with 30 GB given to the JVM (and nothing else of note running on the box). Data: 33G /var/opt/elasticsearch

I'll need to get back to you re the mmapfs setting, I can't readily test that.

I should mention that we're under a constant stream of index updates right now as we're adding additional data to every document -- prior to enabling the "cache everything" behavior, we initially suspected the relatively heavier indexing and the higher percentage of deleted documents were to blame for performance. However, we suspended the indexer for a couple of hours and it didn't help with performance so we dismissed that as a concern. Similarly, we removed the deleted documents (via a force merge) but that didn't help either.

dbaggott commented 8 years ago

Also, I said "box" but these are EC2 instances and the data is on EBS gp2 (300 / 3000) volumes -- that has obvious ramifications on disk access...

jpountz commented 8 years ago

Your filesystem cache is almost the size of your data directory so I would expect the filesystem cache to do a very good job at caching posting data. The fact that caching everything helps and that you are seeing a high hit rate means that many filters are reused, which should also help with no query caching at all since the same data would be read from disk over and over again.

Another data point that could be interesting would be to know if you notice a difference if you force posting data to be loaded in the fs cache (for instance, run cat $DATA_DIR/$CLUSTER_NAME/nodes/*/indices/*/*/index/*.doc > /dev/null just before running your experiment, no restart needed) and whether the hot threads still look the same.

jimczi commented 8 years ago

The hot threads show that the term queries with more than 16 terms are the bottleneck here. When there is more than 16 terms the terms query populates a bit set with the matching docs and then returns a Scorer over this bit set. This is done in isolation (the other parts of the query are not taken into account when the bit set is populated). Your results with the cache all heuristic indicates that you have a lot of query that share the filter part (the same terms are used). Is it the expected behavior ? One thing to notice here is that the big terms query act exactly like if they were cached, the bit set contains the matching documents of the terms query alone, the only difference is that the bit set is not added to the cache at the end :(. This means that the cost of caching this query vs not caching it would be very small in terms of computation. @dbaggott can you compare the number of results between the big terms query alone (more than 16 terms) vs the non-filter part of your query. If the difference is big it might be good to test to split this big terms query into smaller ones (with 16 terms each) like below:

"bool": {
    "should": [
        {
            "terms": {
                "items.fieldA": [A_BLOCK_OF_16_TERMS ]
            }
        },
         {
            "terms": {
                "items.fieldA": [ANOTHER_BLOCK_OF_16_TERMS ]
            }
        }
     ]
}

... this could help only if the non-filter part of the query returns few results.

Another thing, in the cache there is a distinction between costly and cheap queries. Costly queries are cached when we see them twice in the 256 history whereas normal ones are cached after 5 appearances. It could be helpful to add the terms query with more than 16 terms in the list of costly queries (@jpountz ?), it seems that they are not.

jpountz commented 8 years ago

Admittedly I did not check this possibility: my assumption was that if the terms query execution was the bottleneck then the hot threads would point either to the code that decodes postings lists or to the code that builds the DocIdSet. But here all stacks are pointing to disk reads, even though they should be almost free since the filters appear to be reused, which confuses me a bit. I suggested to test mith mmapfs to see if it makes things better to read directly from the fs cache instead of having a (quite large) intermediate buffer like niofs does.

That said, the suggestion to cache large terms filters more aggressively sounds good to me.

dbaggott commented 8 years ago

Thank you so much for the feedback and suggestions. I'm still working on setting up a proper testing environment so that I can perform ad hoc tests to my heart's content so I don't have feedback on your suggestions. Yet.

Your results with the cache all heuristic indicates that you have a lot of query that share the filter part (the same terms are used). Is it the expected behavior ?

Yes, that is definitely expected. The actual values of the terms are arbitrary and can range from 1 value to 100s of values. In practice, when you analyze the stream of incoming queries you see a couple hundred distinct combinations of terms (some combinations with just a few terms some with 100s) that represent the vast bulk of the variants. So, heavy reuse is expected and a very high hit ratio is exactly what I want/see with the "cache all" behavior.

When there is more than 16 terms the terms query populates a bit set with the matching docs and then returns a Scorer over this bit set. This is done in isolation (the other parts of the query are not taken into account when the bit set is populated).

At what point are the other parts of the filter taken into account?

One thing to notice here is that the big terms query act exactly like if they were cached, the bit set contains the matching documents of the terms query alone, the only difference is that the bit set is not added to the cache at the end :(. This means that the cost of caching this query vs not caching it would be very small in terms of computation.

Can you clarify what you mean by this? Particularly the very last statement. Are you saying that the individual bit sets for each term alone is (potentially) cached and so the calculation of the overall bit set can be derived (relatively) cheaply from the individual (hopefully) cached bit sets?

...can you compare the number of results between the big terms query alone (more than 16 terms) vs the non-filter part of your query. If the difference is big...

The delta will vary considerably depending on the exact nature of the query. But, in general, the larger the "big terms" portion of the filter, the more likely it is that the delta will be large. So, if this is a core problem, maybe it could still be a net win. What are the downsides to breaking up the large terms filter when the query portion does NOT return few results? I assume there must be some otherwise it would be implemented as an internal optimization?

Ok, so avenues of investigation to understand/improve the performance of the non-cached filters are:

  1. mmapfs configuration
  2. force posting data to be cached in the fs (@jpountz, this is most interestingly done prior to the mmapfs change, right?)
  3. investigate delta between query-only portion of request and filter-only portion of request and split the "large terms" portion of the filter into groups no bigger than 16

That said, the suggestion to cache large terms filters more aggressively sounds good to me.

That sounds good to me too although, even if it's applicable to the performance of my query stream, it wouldn't help my particular use case much, if at all, given that the history size is insufficiently small to accurately sample my particular filter stream...

The performance suggestions are awesome (thank you!) and deserve to be worked through as the first-tier problem -- I'd much rather have performance improved to the point where caching wasn't necessary then to rely on caching. Or, put another way, I don't want caching to cover up fundamental performance issues!

That being said, even if my use case no longer has a need for filter caching, there's still a fundamental issue with usage tracking of filters: ie, in order to detect repeated usage, the sampling methodology must accommodate the amount of variance in the data stream. Otherwise, filter reuse won't be detected properly.

The challenge (as I'm sure you know) is to solve that problem in a generic fashion that doesn't expose a bunch of configuration complexity and doesn't require a priori understanding of your query complexity (which is prone to changing anyway). I can imagine a time-based rolling window implementation would be more forgiving and therefore closer to "one size fits all". Reservoir sampling might be helpful. I can also imagine a variation of the current history-based one that simply dynamically resizes the history length based on the complexity of the filters. The idea being that if you never see any repeated filters (or, more realistically, the percentage of repeated filters is below some threshold) then the history length is increased. The various thresholds for caching would then be expressed as percentages instead of in absolute terms...

jimczi commented 8 years ago

At what point are the other parts of the filter taken into account?

Just after, suppose you have a terms query with more than 16 terms inside a boolean query with two other clauses. First the bit set with the matching docs for the terms query is build and after that the bit set is used in conjunction with the two other clauses to build the final matching docs. This is done to avoid the explosion of boolean clauses when the number of terms is big.

Can you clarify what you mean by this? Particularly the very last statement. Are you saying that the individual bit sets for each term alone is (potentially) cached and so the calculation of the overall bit set can be derived (relatively) cheaply from the individual (hopefully) cached bit sets?

No the overall bit set for the terms query is computed which is exactly how the cache would work if the terms query was added to the cache. The cache would build the bit set of matching docs for the terms query alone. What I am saying is that terms query with more than 16 terms build this bit set even if the query is not cached. If you cache a query it has the downside of building this individual query alone, so if you have another part of your query that reduce the number of matching docs it is not taken into account. This is why when a query enters the cache the response time can be much bigger. For terms query with more than 16 terms this extra work is done every time so the cost of adding this query to the cache would be the same in terms of computation (and only computation). If this is not clear please read the javadocs of org.apache.lucene.queries.TermsQuery which explain how the terms query are built.

What are the downsides to breaking up the large terms filter when the query portion does NOT return few results?

It's not exactly a downside but in that case the current heuristic on big terms query may be faster. The order of magnitude you're looking for is something like 1000 times smaller. Something like my terms query alone returns 1M results and the query part alone returns 1000 results.

That sounds good to me too although, even if it's applicable to the performance of my query stream, it wouldn't help my particular use case much, if at all, given that the history size is insufficiently small to accurately sample my particular filter stream...

Well you must consider that your big terms query (more than 16 terms) only counts for 1 entry in the cache and not one per individual terms. This is why a more aggressive caching of big term queries may solve your problem without changing the size of the window.

dbaggott commented 8 years ago

@jimferenczi, thank you for the clarifications -- they are helpful!

And I hear you about the more aggressive caching of big term queries, it may help more than I think. But I'm still suspicious my query stream is too diverse for the history size but, ultimately, that's an empirical question and I haven't done the analysis to answer it...

I'll get back to this thread w/r/t the suggestions...

lmenezes commented 8 years ago

We have just upgraded a cluster from 1.4.1 to 2.1, and it seems this is also a problem for us. We did expect issues with the upgrade since we had a lot of queries being explicitly cached.

Anyway, having 2 clusters with the exact same configuration, data and handling exactly the same queries, we have the cluster running 2.1 with a cpu usage 30% higher when compared to 1.4.1.. Part of this could be attributed to the extra effort at indexing time(doc_values), but "pausing" indexing for awhile didn't really change things significantly.

We will give it a try with index.queries.cache.everything, even though I'm afraid this is not a long term solution.

For now, query cache for one of my data nodes(running 2.1) looks like this:

query_cache: {
  memory_size_in_bytes: 14023808,
  total_count: 1079228986,
  hit_count: 12531589,
  miss_count: 1066697397,
  cache_size: 1892,
  cache_count: 5451,
  evictions: 3559
}

Will update this on monday after having index.queries.cache.everything enabled for sometime.

@jpountz Could I gather any more relevant data for you? I do have both 1.4.1 and 2.1 running in parallel, so if you would like to compare something let me know :)

SeoJueun commented 8 years ago

Here I am sharing my use case. I am running elasticsearch as advertisements targeting backend of adserver. There are ad campaigns with multiple targeting parameter like age, carrier, device name, gender, os, region and so on. Usually I have under 5000 campaigns and index size is less than 10MB. It is very read-heavy environment but none of my filter is cached because of small documents number. It would be nice if I can customize caching policy.

jpountz commented 8 years ago

@lmenezes Thanks for offering help. Something that I am interested in would be to know what hot threads report on the 2.x cluster to try to get an idea of what the bottleneck is.

@SeoJueun The fact that none of your filters get cached suggests that you have less than 10k documents in your index. Queries should be very fast anyway on such a small index aren't they?

Notes to self about what to do/investigate next about this issue:

moliware commented 8 years ago

@jpountz I'm working with @lmenezes :)

First of all we set index.queries.cache.everything to true but the CPU usage went even higher and the performance didn't improve. The cache ratios looked better though.

This is a small list of traits that our expensive queries have:

As requested I collected some hot_threads from our data nodes:

"::: {-}{6vJGCdKfTyunL4pbFtfXNw}{-}{-:9300}{dc=fra2, master=false}
   Hot threads at 2016-02-16T09:15:14.824Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

   50.4% (252ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#26]'
     2/10 snapshots sharing following 35 elements
       sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
       org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:180)
       org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
       org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:54)
       org.apache.lucene.store.DataInput.readVInt(DataInput.java:125)
       org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:221)
       org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock(SegmentTermsEnumFrame.java:157)
       org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:507)
       org.apache.lucene.index.TermContext.build(TermContext.java:94)
       org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:192)
       org.apache.lucene.search.DisjunctionMaxQuery$DisjunctionMaxWeight.<init>(DisjunctionMaxQuery.java:126)
       org.apache.lucene.search.DisjunctionMaxQuery.createWeight(DisjunctionMaxQuery.java:212)
       org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
       org.apache.lucene.search.ConstantScoreQuery.createWeight(ConstantScoreQuery.java:117)
       org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
       org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:838)
       org.elasticsearch.common.lucene.search.function.FiltersFunctionScoreQuery.createWeight(FiltersFunctionScoreQuery.java:136)
       org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
       org.apache.lucene.search.BooleanWeight.<init>(BooleanWeight.java:56)
       org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
       org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
       org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:838)
       org.elasticsearch.search.internal.ContextIndexSearcher.createNormalizedWeight(ContextIndexSearcher.java:76)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

::: {-}{c_fVYldtQGuCLio733YH7Q}{-}{-:9300}{dc=fra2, master=false}
   Hot threads at 2016-02-16T09:15:14.825Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

   43.0% (214.9ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#21]'
     4/10 snapshots sharing following 12 elements
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     unique snapshot
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
       java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
       java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
       java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
       org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:161)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

   39.7% (198.7ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#32]'
     2/10 snapshots sharing following 2 elements
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

::: {-}{hvI7cyYYS6yL-JE-qqTgEw}{-}{-:9300}{dc=fra2, master=false}
   Hot threads at 2016-02-16T09:15:14.822Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

   59.7% (298.6ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#6]'
     2/10 snapshots sharing following 12 elements
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     2/10 snapshots sharing following 10 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
       java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
       java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
       java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
       org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:161)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

   55.2% (275.9ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#25]'
     2/10 snapshots sharing following 34 elements
       sun.nio.ch.FileDispatcherImpl.pread0(Native Method)
       sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:52)
       sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:220)
       sun.nio.ch.IOUtil.read(IOUtil.java:197)
       sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:741)
       sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
       org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:180)
       org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
       org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:54)
       org.apache.lucene.store.DataInput.readVInt(DataInput.java:125)
       org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:221)
       org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock(SegmentTermsEnumFrame.java:157)
       org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:507)
       org.apache.lucene.queries.TermsQuery$1.rewrite(TermsQuery.java:283)
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.elasticsearch.common.lucene.search.function.FiltersFunctionScoreQuery$CustomBoostFactorWeight.scorer(FiltersFunctionScoreQuery.java:186)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:256)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     2/10 snapshots sharing following 13 elements
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     6/10 snapshots sharing following 10 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
       java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
       java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
       java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
       org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:161)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

::: {-}{BtlDOZKYQVeNxWHjCWht6g}{-}{-:9300}{dc=fra2, master=false}
   Hot threads at 2016-02-16T09:15:14.824Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

   57.6% (288.1ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#6]'
     2/10 snapshots sharing following 20 elements
       org.apache.lucene.search.MultiTermQueryConstantScoreWrapper$1.scorer(MultiTermQueryConstantScoreWrapper.java:211)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.elasticsearch.common.lucene.search.function.FiltersFunctionScoreQuery$CustomBoostFactorWeight.scorer(FiltersFunctionScoreQuery.java:177)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:256)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     2/10 snapshots sharing following 16 elements
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.booleanScorer(BooleanWeight.java:201)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:233)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     6/10 snapshots sharing following 10 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
       java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
       java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
       java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
       org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:161)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

   52.7% (263.6ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#24]'
     2/10 snapshots sharing following 32 elements
       sun.nio.ch.NativeThread.current(Native Method)
       sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:46)
       sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:737)
       sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
       org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:180)
       org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
       org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:54)
       org.apache.lucene.store.DataInput.readVInt(DataInput.java:125)
       org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:221)
       org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock(SegmentTermsEnumFrame.java:157)
       org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:507)
       org.apache.lucene.queries.TermsQuery$1.rewrite(TermsQuery.java:283)
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.elasticsearch.common.lucene.search.function.FiltersFunctionScoreQuery$CustomBoostFactorWeight.scorer(FiltersFunctionScoreQuery.java:186)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.booleanScorer(BooleanWeight.java:201)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:233)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     3/10 snapshots sharing following 19 elements
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.elasticsearch.common.lucene.search.function.FiltersFunctionScoreQuery$CustomBoostFactorWeight.scorer(FiltersFunctionScoreQuery.java:186)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:256)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     5/10 snapshots sharing following 2 elements
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

   52.7% (263.6ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#18]'
     4/10 snapshots sharing following 22 elements
       org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:507)
       org.apache.lucene.queries.TermsQuery$1.rewrite(TermsQuery.java:283)
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.elasticsearch.common.lucene.search.function.FiltersFunctionScoreQuery$CustomBoostFactorWeight.scorer(FiltersFunctionScoreQuery.java:186)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:256)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     6/10 snapshots sharing following 10 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
       java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
       java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
       java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
       org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:161)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

::: {-}{k5Wscxb-Ts2XpcsMh_FVfg}{-}{-:9300}{dc=fra2, master=false}
   Hot threads at 2016-02-16T09:15:14.824Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

   50.1% (250.6ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#20]'
     3/10 snapshots sharing following 12 elements
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     2/10 snapshots sharing following 10 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
       java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
       java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
       java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
       org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:161)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

   45.3% (226.3ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#12]'
     2/10 snapshots sharing following 18 elements
       org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
       org.apache.lucene.search.BooleanWeight.<init>(BooleanWeight.java:56)
       org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
       org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
       org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:838)
       org.elasticsearch.search.internal.ContextIndexSearcher.createNormalizedWeight(ContextIndexSearcher.java:76)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     2/10 snapshots sharing following 13 elements
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     5/10 snapshots sharing following 10 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
       java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
       java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
       java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
       org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:161)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

::: {-}{uZZ_3YFERjem1ALPjSns8A}{-}{-:9300}{dc=fra2, master=false}
   Hot threads at 2016-02-16T09:15:14.821Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

   45.3% (226.4ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#21]'
     2/10 snapshots sharing following 13 elements
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     2/10 snapshots sharing following 10 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
       java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
       java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
       java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
       org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:161)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

   41.8% (208.7ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#22]'
     2/10 snapshots sharing following 23 elements
       org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:507)
       org.apache.lucene.queries.TermsQuery$1.rewrite(TermsQuery.java:283)
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:256)
       org.apache.lucene.search.BooleanWeight.booleanScorer(BooleanWeight.java:201)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:233)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     8/10 snapshots sharing following 10 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
       java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
       java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
       java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
       org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:161)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

::: {-}{YtYknJhKTZS2KWWWzD-LCQ}{-}{-:9300}{dc=fra2, master=false}
   Hot threads at 2016-02-16T09:15:14.824Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

   46.6% (233.1ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#14]'
     3/10 snapshots sharing following 12 elements
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     2/10 snapshots sharing following 10 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
       java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
       java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
       java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
       org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:161)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

   41.5% (207.4ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#15]'
     3/10 snapshots sharing following 12 elements
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     2/10 snapshots sharing following 10 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
       java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
       java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
       java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
       org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:161)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

   38.0% (190ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#31]'
     7/10 snapshots sharing following 2 elements
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

::: {-}{6_pdQtBLTe2QdeMlFGjY8w}{-}{-:9300}{dc=fra2, master=false}
   Hot threads at 2016-02-16T09:15:14.844Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

   59.6% (298.2ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#22]'
     3/10 snapshots sharing following 20 elements
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.elasticsearch.common.lucene.search.function.FiltersFunctionScoreQuery$CustomBoostFactorWeight.scorer(FiltersFunctionScoreQuery.java:186)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.booleanScorer(BooleanWeight.java:201)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:233)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     4/10 snapshots sharing following 20 elements
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.elasticsearch.common.lucene.search.function.FiltersFunctionScoreQuery$CustomBoostFactorWeight.scorer(FiltersFunctionScoreQuery.java:186)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:256)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     3/10 snapshots sharing following 10 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
       java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
       java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
       java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
       org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:161)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

   57.7% (288.5ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#17]'
     2/10 snapshots sharing following 29 elements
       sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
       org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:180)
       org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
       org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:54)
       org.apache.lucene.store.DataInput.readVInt(DataInput.java:125)
       org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:221)
       org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock(SegmentTermsEnumFrame.java:157)
       org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:507)
       org.apache.lucene.queries.TermsQuery$1.rewrite(TermsQuery.java:283)
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.elasticsearch.common.lucene.search.function.FiltersFunctionScoreQuery$CustomBoostFactorWeight.scorer(FiltersFunctionScoreQuery.java:186)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.booleanScorer(BooleanWeight.java:201)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:233)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     2/10 snapshots sharing following 9 elements
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     6/10 snapshots sharing following 10 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
       java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
       java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
       java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
       org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:161)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

   40.9% (204.6ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#37]'
     2/10 snapshots sharing following 12 elements
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     6/10 snapshots sharing following 10 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
       java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
       java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
       java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
       org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:161)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     unique snapshot
       sun.nio.ch.NativeThread.current(Native Method)
       sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:46)
       sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:737)
       sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
       org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:180)
       org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
       org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:54)
       org.apache.lucene.store.DataInput.readVInt(DataInput.java:125)
       org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:221)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader$BlockState.doReset(CompressingStoredFieldsReader.java:409)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader$BlockState.reset(CompressingStoredFieldsReader.java:394)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.document(CompressingStoredFieldsReader.java:573)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:583)
       org.apache.lucene.index.CodecReader.document(CodecReader.java:81)
       org.apache.lucene.index.FilterLeafReader.document(FilterLeafReader.java:405)
       org.elasticsearch.search.fetch.FetchPhase.loadStoredFields(FetchPhase.java:406)
       org.elasticsearch.search.fetch.FetchPhase.createSearchHit(FetchPhase.java:203)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:168)
       org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:589)
       org.elasticsearch.search.action.SearchServiceTransportAction$FetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:408)
       org.elasticsearch.search.action.SearchServiceTransportAction$FetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:405)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

::: {-}{pmv-1-KNQlK3t6YzyPjS-Q}{-}{-:9300}{dc=fra2, master=false}
   Hot threads at 2016-02-16T09:15:14.846Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

   60.0% (299.8ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#14]'
     3/10 snapshots sharing following 12 elements
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     3/10 snapshots sharing following 10 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
       java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
       java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
       java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
       org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:161)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

   56.6% (282.8ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#13]'
     5/10 snapshots sharing following 2 elements
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

   47.7% (238.5ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#21]'
     7/10 snapshots sharing following 36 elements
       sun.nio.ch.NativeThread.current(Native Method)
       sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:46)
       sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:737)
       sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
       org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:180)
       org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
       org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:54)
       org.apache.lucene.store.DataInput.readVInt(DataInput.java:125)
       org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:221)
       org.apache.lucene.codecs.lucene50.Lucene50DocValuesProducer$CompressedBinaryDocValues$CompressedBinaryTermsEnum.readHeader(Lucene50DocValuesProducer.java:1116)
       org.apache.lucene.codecs.lucene50.Lucene50DocValuesProducer$CompressedBinaryDocValues$CompressedBinaryTermsEnum.seekExact(Lucene50DocValuesProducer.java:1258)
       org.apache.lucene.codecs.lucene50.Lucene50DocValuesProducer$CompressedBinaryDocValues.get(Lucene50DocValuesProducer.java:1066)
       org.apache.lucene.codecs.lucene50.Lucene50DocValuesProducer$LongBinaryDocValues.get(Lucene50DocValuesProducer.java:1008)
       org.apache.lucene.codecs.lucene50.Lucene50DocValuesProducer$7.lookupOrd(Lucene50DocValuesProducer.java:629)
       org.apache.lucene.index.SingletonSortedSetDocValues.lookupOrd(SingletonSortedSetDocValues.java:59)
       org.elasticsearch.index.fielddata.FieldData$8.valueAt(FieldData.java:393)
       org.elasticsearch.search.aggregations.bucket.terms.StringTermsAggregator$1.collect(StringTermsAggregator.java:86)
       org.elasticsearch.search.aggregations.LeafBucketCollector.collect(LeafBucketCollector.java:88)
       org.apache.lucene.search.MultiCollector$MultiLeafCollector.collect(MultiCollector.java:145)
       org.apache.lucene.search.TimeLimitingCollector$1.collect(TimeLimitingCollector.java:158)
       org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:218)
       org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:169)
       org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:772)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     3/10 snapshots sharing following 38 elements
       sun.nio.ch.FileDispatcherImpl.pread0(Native Method)
       sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:52)
       sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:220)
       sun.nio.ch.IOUtil.read(IOUtil.java:197)
       sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:741)
       sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
       org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:180)
       org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
       org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:54)
       org.apache.lucene.store.DataInput.readVInt(DataInput.java:125)
       org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:221)
       org.apache.lucene.codecs.lucene50.Lucene50DocValuesProducer$CompressedBinaryDocValues$CompressedBinaryTermsEnum.readHeader(Lucene50DocValuesProducer.java:1116)
       org.apache.lucene.codecs.lucene50.Lucene50DocValuesProducer$CompressedBinaryDocValues$CompressedBinaryTermsEnum.seekExact(Lucene50DocValuesProducer.java:1258)
       org.apache.lucene.codecs.lucene50.Lucene50DocValuesProducer$CompressedBinaryDocValues.get(Lucene50DocValuesProducer.java:1066)
       org.apache.lucene.codecs.lucene50.Lucene50DocValuesProducer$LongBinaryDocValues.get(Lucene50DocValuesProducer.java:1008)
       org.apache.lucene.codecs.lucene50.Lucene50DocValuesProducer$7.lookupOrd(Lucene50DocValuesProducer.java:629)
       org.apache.lucene.index.SingletonSortedSetDocValues.lookupOrd(SingletonSortedSetDocValues.java:59)
       org.elasticsearch.index.fielddata.FieldData$8.valueAt(FieldData.java:393)
       org.elasticsearch.search.aggregations.bucket.terms.StringTermsAggregator$1.collect(StringTermsAggregator.java:86)
       org.elasticsearch.search.aggregations.LeafBucketCollector.collect(LeafBucketCollector.java:88)
       org.apache.lucene.search.MultiCollector$MultiLeafCollector.collect(MultiCollector.java:145)
       org.apache.lucene.search.TimeLimitingCollector$1.collect(TimeLimitingCollector.java:158)
       org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:218)
       org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:169)
       org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:772)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

::: {-}{55aSbiJ4SnSYs3mr_2m6_g}{-}{-:9300}{dc=fra2, master=false}
   Hot threads at 2016-02-16T09:15:14.821Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

   51.8% (258.9ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#30]'
     3/10 snapshots sharing following 2 elements
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

   51.3% (256.5ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#25]'
     3/10 snapshots sharing following 12 elements
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     2/10 snapshots sharing following 10 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
       java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
       java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
       java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
       org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:161)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)

   47.6% (238ms out of 500ms) cpu usage by thread 'elasticsearch[-][search][T#27]'
     2/10 snapshots sharing following 34 elements
       sun.nio.ch.FileDispatcherImpl.pread0(Native Method)
       sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:52)
       sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:220)
       sun.nio.ch.IOUtil.read(IOUtil.java:197)
       sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:741)
       sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
       org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:180)
       org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
       org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:54)
       org.apache.lucene.store.DataInput.readVInt(DataInput.java:125)
       org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:221)
       org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock(SegmentTermsEnumFrame.java:157)
       org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:507)
       org.apache.lucene.queries.TermsQuery$1.rewrite(TermsQuery.java:283)
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.elasticsearch.common.lucene.search.function.FiltersFunctionScoreQuery$CustomBoostFactorWeight.scorer(FiltersFunctionScoreQuery.java:186)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.booleanScorer(BooleanWeight.java:201)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:233)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     2/10 snapshots sharing following 29 elements
       sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
       org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:180)
       org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
       org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:54)
       org.apache.lucene.store.DataInput.readVInt(DataInput.java:125)
       org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:221)
       org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock(SegmentTermsEnumFrame.java:157)
       org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:507)
       org.apache.lucene.queries.TermsQuery$1.rewrite(TermsQuery.java:283)
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.elasticsearch.common.lucene.search.function.FiltersFunctionScoreQuery$CustomBoostFactorWeight.scorer(FiltersFunctionScoreQuery.java:186)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:256)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     2/10 snapshots sharing following 22 elements
       org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:507)
       org.apache.lucene.queries.TermsQuery$1.rewrite(TermsQuery.java:283)
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.elasticsearch.common.lucene.search.function.FiltersFunctionScoreQuery$CustomBoostFactorWeight.scorer(FiltersFunctionScoreQuery.java:186)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.booleanScorer(BooleanWeight.java:201)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:233)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     2/10 snapshots sharing following 31 elements
       sun.nio.ch.NativeThread.current(Native Method)
       sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:46)
       sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:737)
       sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
       org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:180)
       org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
       org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:140)
       org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:116)
       org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock(SegmentTermsEnumFrame.java:176)
       org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:507)
       org.apache.lucene.queries.TermsQuery$1.rewrite(TermsQuery.java:283)
       org.apache.lucene.queries.TermsQuery$1.scorer(TermsQuery.java:347)
       org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:605)
       org.elasticsearch.indices.cache.query.IndicesQueryCache$CachingWeightWrapper.scorer(IndicesQueryCache.java:262)
       org.elasticsearch.common.lucene.search.function.FiltersFunctionScoreQuery$CustomBoostFactorWeight.scorer(FiltersFunctionScoreQuery.java:186)
       org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:274)
       org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
       org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:256)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:769)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:324)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:106)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:363)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:375)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368)
       org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:299)
       org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
     2/10 snapshots sharing following 2 elements
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       java.lang.Thread.run(Thread.java:745)
"

Thanks!

SeoJueun commented 8 years ago

@jpountz As you said, query is quite fast with small index. But I can get better performance with filter caching even if index is small. I did experiment to compare performance between index.queries.cache.everything true and false in our production environment. true version comes with about 30% less cpu consumption than false version. Since index is really small, our elasticsearch cluster is not memory bounded but cpu bounded. It would be really nice if I can utilize enough memory with filter caching. I know my environment is one of very rare case but I think elasticsearch can be used as a perfect advertisements targeting platform as well.

lmenezes commented 8 years ago

@SeoJueun actually, we also use it for ads. We just haven't upgraded this particular cluster yet, but I share the concerns in this case too :)

jpountz commented 8 years ago

For the record, here's one change that should help already: #16851.

jrots commented 8 years ago

Are queries executing faster again in 2.3 with the fixes of @jpountz, has anyone upgraded and seen improvements? I'm still running 2.2 with

index.queries.cache.everything: true

in production, but a bit unsure to just make the switch without this settings if things would be slow again.

jrots commented 8 years ago

Ok upgraded to 2.3.1 and removed index.queries.cache.everything: true. Search latency is lower and I can achieve more searches per sec while load stays the same. So things are a lot better then before, thx for the fixes @jpountz

jpountz commented 8 years ago

@jrots you are welcome! Thank you for the feedback, I will close this issue now.

amanullah92 commented 5 years ago

@jpountz - Would it make sense to tell it upfront in the documentation that term queries are not cached now? See this link https://www.elastic.co/guide/en/elasticsearch/guide/current/_finding_exact_values.html Created a pull request here: https://github.com/elastic/elasticsearch-definitive-guide/pull/781