Azure / azure-cosmosdb-java

Java Async SDK for SQL API of Azure Cosmos DB
MIT License
54 stars 61 forks source link

fixIllegalStateException #395

Closed xinlian12 closed 2 years ago

xinlian12 commented 2 years ago

Exception: java.lang.IllegalStateException: null at com.google.common.base.Preconditions.checkState(Preconditions.java:491) ~[guava-27.0.1-jre.jar!/:na] at com.microsoft.azure.cosmosdb.internal.directconnectivity.rntbd.RntbdClientChannelPool.decrementAndRunTaskQueue(RntbdClientChannelPool.java:443) ~[azure-cosmosdb-direct-2.6.13.jar!/:na] at com.microsoft.azure.cosmosdb.internal.directconnectivity.rntbd.RntbdClientChannelPool.access$600(RntbdClientChannelPool.java:66) ~[azure-cosmosdb-direct-2.6.13.jar!/:na] at com.microsoft.azure.cosmosdb.internal.directconnectivity.rntbd.RntbdClientChannelPool$AcquireListener.fail(RntbdClientChannelPool.java:609) ~[azure-cosmosdb-direct-2.6.13.jar!/:na] at com.microsoft.azure.cosmosdb.internal.directconnectivity.rntbd.RntbdClientChannelPool$AcquireListener.lambda$null$0(RntbdClientChannelPool.java:595) ~[azure-cosmosdb-direct-2.6.13.jar!/:na] at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:511) [netty-common-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:504) [netty-common-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:483) [netty-common-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:424) [netty-common-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:121) [netty-common-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.util.internal.PromiseNotificationUtil.tryFailure(PromiseNotificationUtil.java:64) [netty-common-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.DelegatingChannelPromiseNotifier.operationComplete(DelegatingChannelPromiseNotifier.java:57) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.DelegatingChannelPromiseNotifier.operationComplete(DelegatingChannelPromiseNotifier.java:31) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractCoalescingBufferQueue.releaseAndCompleteAll(AbstractCoalescingBufferQueue.java:340) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractCoalescingBufferQueue.releaseAndFailAll(AbstractCoalescingBufferQueue.java:207) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at com.microsoft.azure.cosmosdb.internal.directconnectivity.rntbd.RntbdRequestManager.completeAllPendingRequestsExceptionally(RntbdRequestManager.java:617) [azure-cosmosdb-direct-2.6.13.jar!/:na] at com.microsoft.azure.cosmosdb.internal.directconnectivity.rntbd.RntbdRequestManager.exceptionCaught(RntbdRequestManager.java:312) [azure-cosmosdb-direct-2.6.13.jar!/:na] at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:285) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:264) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:256) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.ChannelHandlerAdapter.exceptionCaught(ChannelHandlerAdapter.java:87) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:285) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:264) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:256) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:131) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:285) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:264) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:256) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireExceptionCaught(CombinedChannelDuplexHandler.java:426) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.ChannelHandlerAdapter.exceptionCaught(ChannelHandlerAdapter.java:87) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.CombinedChannelDuplexHandler$1.fireExceptionCaught(CombinedChannelDuplexHandler.java:147) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:131) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.CombinedChannelDuplexHandler.exceptionCaught(CombinedChannelDuplexHandler.java:233) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:285) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:264) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:256) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:131) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:285) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:264) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:256) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.handler.ssl.SslHandler.exceptionCaught(SslHandler.java:1074) [netty-handler-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:285) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:264) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:256) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.DefaultChannelPipeline$HeadContext.exceptionCaught(DefaultChannelPipeline.java:1375) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:285) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:264) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.DefaultChannelPipeline.fireExceptionCaught(DefaultChannelPipeline.java:918) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:125) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:174) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:677) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:612) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:529) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:491) [netty-transport-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:905) [netty-common-4.1.33.Final.jar!/:4.1.33.Final] at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.33.Final.jar!/:4.1.33.Final] at java.lang.Thread.run(Thread.java:748) [na:1.8.0_212]

Root Cause:

RntbdClientChannelPool -> close due to no active request after 70s -> set acquiredChannelsCount to 0 newConnection -> established -> RntbdContext negotiation -> timed out (took more than 70s) -> fail request -> decrease acquiredChannelsCount -> throw IllegalStateException if acquiredChannelsCount < 0

Change

  1. Change default IdleEndpointTimeout from 70s to 1h
  2. only decrease acquiredChannelsCount if RntbdChannelPool still active

Local tests Mock operation timed out during RntbdContext negotiation

before the fix: image

after the fix: image

simplynaveen20 commented 2 years ago

https://github.com/Azure/azure-cosmosdb-java/pull/395 merged