Closed sunnycalsoft closed 1 year ago
This looks like a driver issue. @jyemin what do you think?
Yes, this is driver-related, but it's not clear what the effect is on the application. The stack trace in the description is from an exception thrown from a connection pool-managed background thread and logged by the framework . @sunnycalsoft are operations also failing in your application, and if so can you post any stack traces you have from those threads?
@jyemin yes the problem is that even health check failed for mongodb in the application and application gets restarted. We have provided 200 max thread pool and 100 min thread pool as well should be goood enough for the application.
{"time":"2022-10-12T19:11:29,701Z","level":"ERROR","thread":"SNOW-SCHEDULER2","logger":"com.radisys.macs.cems.configmgmt.topology.statemachine.scheduler.TopologyNodeStateWatcher","message":"Timed out after 120000 ms while waiting for a connection to server psmdb-db-mongos.mongodb.svc.cluster.local:27017.; nested exception is com.mongodb.MongoTimeoutException: Timed out after 120000 ms while waiting for a connection to server psmdb-db-mongos.mongodb.svc.cluster.local:27017.","context":"default","exception":"com.google.common.flogger.LogSiteStackTrace: SMALL\n\tat com.radisys.macs.cems.configmgmt.topology.statemachine.scheduler.TopologyNodeStateWatcher.raiseDiscoveryEventIfLockedForMaxTime(TopologyNodeStateWatcher.java:146)\n\tat jdk.internal.reflect.GeneratedMethodAccessor334.invoke(Unknown Source)\n\tat java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)\n\tat java.base/java.lang.reflect.Method.invoke(Unknown Source)\n\tat org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84)\n\tat org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)\n\tat java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)\n\tat java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)\n\tat java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)\nCaused by: org.springframework.dao.DataAccessResourceFailureException: Timed out after 120000 ms while waiting for a connection to server psmdb-db-mongos.mongodb.svc.cluster.local:27017.; nested exception is com.mongodb.MongoTimeoutException: Timed out after 120000 ms while waiting for a connection to server psmdb-db-mongos.mongodb.svc.cluster.local:27017.\n\tat org.springframework.data.mongodb.core.MongoExceptionTranslator.translateExceptionIfPossible(MongoExceptionTranslator.java:95)\n\tat org.springframework.data.mongodb.core.ReactiveMongoTemplate.potentiallyConvertRuntimeException(ReactiveMongoTemplate.java:2954)\n\tat org.springframework.data.mongodb.core.ReactiveMongoTemplate.lambda$translateException$89(ReactiveMongoTemplate.java:2937)\n\tat reactor.core.publisher.Mono.lambda$onErrorMap$31(Mono.java:3733)\n\tat reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:94)\n\tat reactor.core.publisher.MonoFlatMap$FlatMapMain.secondError(MonoFlatMap.java:192)\n\tat reactor.core.publisher.MonoFlatMap$FlatMapInner.onError(MonoFlatMap.java:259)\n\tat reactor.core.publisher.MonoNext$NextSubscriber.onError(MonoNext.java:93)\n\tat reactor.core.publisher.MonoFlatMap$FlatMapMain.secondError(MonoFlatMap.java:192)\n\tat reactor.core.publisher.MonoFlatMap$FlatMapInner.onError(MonoFlatMap.java:259)\n\tat reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onError(MonoPeekTerminal.java:258)\n\tat reactor.core.publisher.MonoCreate$DefaultMonoSink.error(MonoCreate.java:201)\n\tat com.mongodb.reactivestreams.client.internal.MongoOperationPublisher.lambda$sinkToCallback$30(MongoOperationPublisher.java:545)\n\tat com.mongodb.reactivestreams.client.internal.OperationExecutorImpl.lambda$execute$2(OperationExecutorImpl.java:92)\n\tat com.mongodb.internal.operation.CountDocumentsOperation.lambda$executeAsync$1(CountDocumentsOperation.java:139)\n\tat com.mongodb.internal.async.ErrorHandlingResultCallback.onResult(ErrorHandlingResultCallback.java:48)\n\tat com.mongodb.internal.async.function.AsyncCallbackSupplier.lambda$whenComplete$1(AsyncCallbackSupplier.java:95)\n\tat com.mongodb.internal.async.function.RetryingAsyncCallbackSupplier$RetryingCallback.onResult(RetryingAsyncCallbackSupplier.java:109)\n\tat com.mongodb.internal.async.ErrorHandlingResultCallback.onResult(ErrorHandlingResultCallback.java:48)\n\tat com.mongodb.internal.async.function.AsyncCallbackSupplier.lambda$whenComplete$1(AsyncCallbackSupplier.java:95)\n\tat com.mongodb.internal.async.ErrorHandlingResultCallback.onResult(ErrorHandlingResultCallback.java:48)\n\tat com.mongodb.internal.operation.OperationHelper.lambda$withAsyncSuppliedResource$7(OperationHelper.java:649)\n\tat com.mongodb.internal.connection.DefaultServer$1.onResult(DefaultServer.java:130)\n\tat com.mongodb.internal.connection.DefaultServer$1.onResult(DefaultServer.java:122)\n\tat com.mongodb.internal.async.ErrorHandlingResultCallback.onResult(ErrorHandlingResultCallback.java:48)\n\tat com.mongodb.internal.connection.DefaultConnectionPool.lambda$getAsync$0(DefaultConnectionPool.java:200)\n\tat com.mongodb.internal.connection.DefaultConnectionPool$OpenConcurrencyLimiter.openAsyncWithConcurrencyLimit(DefaultConnectionPool.java:944)\n\tat com.mongodb.internal.connection.DefaultConnectionPool.lambda$getAsync$1(DefaultConnectionPool.java:231)\n\tat com.mongodb.internal.connection.DefaultConnectionPool$Task.doComplete(DefaultConnectionPool.java:1385)\n\tat com.mongodb.internal.connection.DefaultConnectionPool$Task.execute(DefaultConnectionPool.java:1371)\n\tat com.mongodb.internal.connection.DefaultConnectionPool$AsyncWorkManager.workerRun(DefaultConnectionPool.java:1311)\n\tat com.mongodb.internal.connection.DefaultConnectionPool$AsyncWorkManager.runAndLogUncaught(DefaultConnectionPool.java:1342)\n\tat com.mongodb.internal.connection.DefaultConnectionPool$AsyncWorkManager.lambda$initUnlessClosed$0(DefaultConnectionPool.java:1275)\n\tat java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)\n\tat java.base/java.util.concurrent.FutureTask.run(Unknown Source)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)\n\tat java.base/java.lang.Thread.run(Unknown Source)\n\tSuppressed: java.lang.Exception: #block terminated with an error\n\t\tat reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:99)\n\t\tat reactor.core.publisher.Mono.block(Mono.java:1707)\n\t\tat com.radisys.macs.cems.configmgmt.topology.statemachine.repository.TopologyDaoRepositoryImpl.getTopologyCount(TopologyDaoRepositoryImpl.java:346)\n\t\tat com.radisys.macs.cems.service.PersistenceServiceImpl.getTopologyCount(PersistenceServiceImpl.java:387)\n\t\tat com.radisys.macs.cems.configmgmt.topology.statemachine.scheduler.TopologyNodeStateWatcher.getAllTopologyNodeCountWithDiscoveringState(TopologyNodeStateWatcher.java:159)\n\t\tat com.radisys.macs.cems.configmgmt.topology.statemachine.scheduler.TopologyNodeStateWatcher.raiseDiscoveryEventIfLockedForMaxTime(TopologyNodeStateWatcher.java:134)\n\t\tat jdk.internal.reflect.GeneratedMethodAccessor334.invoke(Unknown Source)\n\t\tat java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)\n\t\tat java.base/java.lang.reflect.Method.invoke(Unknown Source)\n\t\tat org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84)\n\t\tat org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)\n\t\tat java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)\n\t\tat java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)\n\t\tat java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)\n\t\t... 3 common frames omitted\nCaused by: com.mongodb.MongoTimeoutException: Timed out after 120000 ms while waiting for a connection to server psmdb-db-mongos.mongodb.svc.cluster.local:27017.\n\tat com.mongodb.internal.connection.DefaultConnectionPool.createTimeoutException(DefaultConnectionPool.java:330)\n\tat com.mongodb.internal.connection.DefaultConnectionPool.access$2000(DefaultConnectionPool.java:106)\n\tat com.mongodb.internal.connection.DefaultConnectionPool$OpenConcurrencyLimiter.acquirePermitOrGetAvailableOpenedConnection(DefaultConnectionPool.java:1003)\n\tat com.mongodb.internal.connection.DefaultConnectionPool$OpenConcurrencyLimiter.openAsyncWithConcurrencyLimit(DefaultConnectionPool.java:941)\n\t... 11 common frames omitted\n","correlation-id":"da10fd6c-4a57-11ed-9fbc-8376975cee80","source-ms":"","user-id":"","user-role":"","additional-info":""}
@sunnycalsoft The presented com.mongodb.MongoTimeoutException
s originating from DefaultConnectionPool
don't by themselves signify a driver problem. They happen when the pool is asked for a connection, but does not have enough available non-perished connections. When this happens, the pool tries to create and establish a new connection. The former is constrained by pool's [max size](https://mongodb.github.io/mongo-java-driver/4.7/apidocs/mongodb-driver-core/com/mongodb/connection/ConnectionPoolSettings.html#getMaxSize()), the latter is constrained by pool's [max connecting limit](https://mongodb.github.io/mongo-java-driver/4.7/apidocs/mongodb-driver-core/com/mongodb/connection/ConnectionPoolSettings.html#getMaxConnecting()). The stack traces have the acquirePermitOrGetAvailableOpenedConnection
frame, which tells that the activity of checking out a connection has passed the max size barrier (though it is unknown how much of the checkout timeout it spent there, maybe most of it), but failed to pass the max connecting barrier within the rest of the timeout (passing this barrier happens either because the pool allows the checkout request to establish a connection, or because a different connection was checked in the pool and was not checked out by a different request that had also been waiting on this barrier).
I can think of two situations that may cause the exception:
checkedOutCount
, size
, maxSize
via JMX as described here. If the pool actually operates at / close to its capacity, you may try to increase the max size. Hopefully the above will help. If it won't, and your company has support from MongoDB, then, of course, you may want to create a support case.
This ticket relates to MongoDB driver internals and nothing that we can handle from the Spring Data side.
Getting this error after application run 30-40 minutes then it restarts and application starts working