elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Circuit breaking exceptions thrown by transport layer are hard to follow #100887

Open javanna opened 1 year ago

javanna commented 1 year ago

I have been getting circuit breaking exceptions when updating index templates, as well as when modifying a data stream. The error returned makes users think that something is wrong with their request, for example that it was too big. In reality, retrying the same request later works fine in most cases.

We should try to improve the error to make it more actionable for users: do we expect them to retry their request? Should we be clearer about the fact that nothing is wrong with their request, but that something else is causing the error and may need attention?
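For illustration, here is a rough sketch of what a client has to do today to treat this error as retryable. It assumes the response carries the `status`, `type` and `durability` fields seen in the error in this report. The `send` callable and both function names are hypothetical and not part of any Elasticsearch client.

```python
import random
import time


def is_transient_breaker_error(status, body):
    """Return True if the response looks like a transient circuit breaker trip."""
    error = body.get("error") if isinstance(body, dict) else None
    if not isinstance(error, dict):
        return False
    return (
        status == 429
        and error.get("type") == "circuit_breaking_exception"
        and error.get("durability") == "TRANSIENT"
    )


def call_with_retry(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it stops returning a transient breaker error.

    `send` is a hypothetical callable returning (status, parsed_json_body).
    The last response is returned if all attempts are used up.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if not is_transient_breaker_error(status, body):
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff with jitter so retries do not arrive in lockstep.
            sleep(base_delay * (2 ** attempt) * (1 + random.random()))
    return status, body
```

Nothing in the current message tells a user that this kind of retry is the expected reaction, which is the gap this issue is about.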

{
  "error": {
    "root_cause": [
      {
        "type": "circuit_breaking_exception",
        "reason": "[parent] Data too large, data for [indices:admin/data_stream/modify] would be [4071282682/3.7gb], which is larger than the limit of [3900912435/3.6gb], real usage: [4071282016/3.7gb], new bytes reserved: [666/666b], usages [model_inference=0/0b, eql_sequence=0/0b, fielddata=0/0b, request=0/0b, inflight_requests=1260/1.2kb]",
        "bytes_wanted": 4071282682,
        "bytes_limit": 3900912435,
        "durability": "TRANSIENT",
        "stack_trace": """org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [indices:admin/data_stream/modify] would be [4071282682/3.7gb], which is larger than the limit of [3900912435/3.6gb], real usage: [4071282016/3.7gb], new bytes reserved: [666/666b], usages [model_inference=0/0b, eql_sequence=0/0b, fielddata=0/0b, request=0/0b, inflight_requests=1260/1.2kb]
    at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.checkParentLimit(HierarchyCircuitBreakerService.java:414)
    at org.elasticsearch.common.breaker.ChildMemoryCircuitBreaker.addEstimateBytesAndMaybeBreak(ChildMemoryCircuitBreaker.java:109)
    at org.elasticsearch.transport.InboundAggregator.checkBreaker(InboundAggregator.java:215)
    at org.elasticsearch.transport.InboundAggregator.finishAggregation(InboundAggregator.java:119)
    at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:121)
    at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:96)
    at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:61)
    at org.elasticsearch.transport.netty4.Netty4MessageInboundHandler.channelRead(Netty4MessageInboundHandler.java:48)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1383)
    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1246)
    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1295)
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at java.lang.Thread.run(Thread.java:1583)
"""
      }
    ],
    "type": "circuit_breaking_exception",
    "reason": "[parent] Data too large, data for [indices:admin/data_stream/modify] would be [4071282682/3.7gb], which is larger than the limit of [3900912435/3.6gb], real usage: [4071282016/3.7gb], new bytes reserved: [666/666b], usages [model_inference=0/0b, eql_sequence=0/0b, fielddata=0/0b, request=0/0b, inflight_requests=1260/1.2kb]",
    "bytes_wanted": 4071282682,
    "bytes_limit": 3900912435,
    "durability": "TRANSIENT",
    "stack_trace": """org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [indices:admin/data_stream/modify] would be [4071282682/3.7gb], which is larger than the limit of [3900912435/3.6gb], real usage: [4071282016/3.7gb], new bytes reserved: [666/666b], usages [model_inference=0/0b, eql_sequence=0/0b, fielddata=0/0b, request=0/0b, inflight_requests=1260/1.2kb]
    at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.checkParentLimit(HierarchyCircuitBreakerService.java:414)
    at org.elasticsearch.common.breaker.ChildMemoryCircuitBreaker.addEstimateBytesAndMaybeBreak(ChildMemoryCircuitBreaker.java:109)
    at org.elasticsearch.transport.InboundAggregator.checkBreaker(InboundAggregator.java:215)
    at org.elasticsearch.transport.InboundAggregator.finishAggregation(InboundAggregator.java:119)
    at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:121)
    at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:96)
    at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:61)
    at org.elasticsearch.transport.netty4.Netty4MessageInboundHandler.channelRead(Netty4MessageInboundHandler.java:48)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1383)
    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1246)
    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1295)
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at java.lang.Thread.run(Thread.java:1583)
"""
  },
  "status": 429
}
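The numbers buried in the `reason` string already show that the request is not the problem, but a reader has to dig for them. A small sketch that pulls them out, assuming the message keeps the exact wording shown above (the pattern and the function name are mine, not an Elasticsearch API):

```python
import re

# Matches the parent breaker message format shown in this report.
REASON_PATTERN = re.compile(
    r"data for \[(?P<action>[^\]]+)\] would be \[(?P<wanted>\d+)/[^\]]+\], "
    r"which is larger than the limit of \[(?P<limit>\d+)/[^\]]+\], "
    r"real usage: \[(?P<real_usage>\d+)/[^\]]+\], "
    r"new bytes reserved: \[(?P<new_bytes>\d+)/[^\]]+\]"
)


def summarize_breaker_reason(reason):
    """Extract the byte counts from a parent breaker reason, or None if it does not match."""
    match = REASON_PATTERN.search(reason)
    if match is None:
        return None
    fields = {
        key: (value if key == "action" else int(value))
        for key, value in match.groupdict().items()
    }
    # True when the node was over the limit before this request asked for anything.
    fields["over_limit_before_request"] = fields["real_usage"] > fields["limit"]
    return fields
```

For the error above this gives `new_bytes` of 666 and `real_usage` of 4071282016 against a `limit` of 3900912435. The node was already over the limit before the request arrived, and the request itself accounts for only 666 bytes of the 4071282682 reported as `bytes_wanted`.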

elasticsearchmachine commented 1 year ago

Pinging @elastic/es-distributed (Team:Distributed)

elasticsearchmachine commented 1 year ago

Pinging @elastic/es-core-infra (Team:Core/Infra)